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Abstract 



This review paper is intended to give a useful guide for those who want to apply discrete 
wavelets in their practice. The notion of wavelets and their use in practical computing and 
various applications are briefly described, but rigorous proofs of mathematical statements 
are omitted, and the reader is just referred to corresponding literature. The multiresolution 
analysis and fast wavelet transform became a standard procedure for dealing with discrete 
wavelets. The proper choice of a wavelet and use of nonstandard matrix multiplication 
are often crucial for achievement of a goal. Analysis of various functions with the help of 
wavelets allows to reveal fractal structures, singularities etc. Wavelet transform of operator 
expressions helps solve some equations. In practical applications one deals often with the 
discretized functions, and the problem of stability of wavelet transform and corresponding 
numerical algorithms becomes important. After discussing all these topics we turn to prac- 
tical applications of the wavelet machinery. They are so numerous that we have to limit 
ourselves by some examples only. The authors would be grateful for any comments which 
improve this review paper and move us closer to the goal proclaimed in the first phrase of 
the abstract. 

1 Introduction 

Wavelets became a necessary mathematical tool in many investigations. They are used in those 
cases when the result of the analysis of a particular signal should contain not only the list of its 
typical frequencies (scales) but also the knowledge of the definite local coordinates where these 
properties are important. Thus, analysis and processing of different classes of nonstationary (in 
time) or inhomogeneous (in space) signals is the main field of applications of wavelet analysis . The 
most general principle of the wavelet construction is to use dilations and translations. Commonly 
used wavelets form a complete orthonormal system of functions with a finite support constructed 
in such a way. That is why by changing a scale (dilations) they can distinct the local characteristics 
of a signal at various scales, and by translations they cover the whole region in which it is studied. 
Due to the completeness of the system, they also allow for the inverse transformation to be done. 
In the analysis of nonstationary signals, the locality property of wavelets leads to their substantial 
advantage over Fourier transform which provides us only with the knowledge of global frequencies 
(scales) of the object under investigation because the system of the basic functions used (sine, 
cosine or imaginary exponential functions) is defined on the infinite interval^. However, as we 
shall see, the more general definitions and, correspondingly, a variety of forms of wavelets are 
used which admit a wider class of functions to be considered. According to Y. Meyer 0, "the 
wavelet bases are universally applicable: "everything that comes to hand", whether function or 
distribution, is the sum of a wavelet series and, contrary to what happens with Fourier series, the 
coefficients of the wavelet series translate the properties of the function or distribution simply, 
precisely and faithfully." 

The literature devoted to wavelets is very voluminous, and one can easily get a lot of references 
by sending the corresponding request to Internet web sites. Mathematical problems are treated 
in many monographs in detail (e.g., see |l|> H H, U). The introductory courses on wavelets can be 

1 The notion of a signal is used here for any ordered set of numerically recorded information about some processes, 
objects, functions etc. The signal can be a function of some coordinates, would it be the time, the space or any 
other (in general, n-dimensional) scale. 

2 Comparison of the wavelet transform with the so-called windowed (within the finite interval) Fourier transform 
will be briefly discussed below. 
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found in the books []5], || [7], . The nice review paper adapted for beginners and practical users with 
demonstration of wavelet transform of some signals was published in this journal about four years 
ago and attracted much attention. However continuous wavelets are mostly considered there 
whereas discrete wavelets are just briefly mentioned. This choice was dictated by the fact that the 
continuous wavelets admit somewhat more visual and picturesque presentation of the results of 
the analysis of a signal in terms of local maxima and sceleton graphs of wavelet coefficients with 
continuous variables. 

At the same time, the main bulk of papers dealing with practical applications of wavelet 
analysis uses discrete wavelets which will be our main concern here. This preference of discrete 
wavelets is related to the fact that widely used continuous wavelets are not, strictly speaking, 
orthonormal because they are both infinitely differentiable and exponentially decreasing at infinity 
what violates orthonormalization property whereas there is no such problem for discrete wavelets. 
That is why the discrete wavelets allow often for more accurate transform and presentation of 
the analyzed signal, and, especially, for its inverse transform after the compression procedure. 
Moreover, they are better adapted to communication theory and practice. These comments do 
not imply that we insist on use of only discrete wavelets for signal analysis . On the contrary, the 
continuous wavelets may provide sometimes more transparent and analytical results in modelling 
the signal analysis than the discrete wavelets. 

The choice of a specific wavelet , would it be discrete or continuous one, depends on the 
analyzed signal and on the problem to be solved. Some functions are best analyzed using one 
method or another, and it depends on the relative simplicity of the decomposition achieved. 
The researcher's intuition helps resolve this "miracle". As an analogy, the example with number 
systems is often considered. It is a matter of tradition and convenience to choose the systems with 
the base 10, 2 or e. However, the Roman number system is completely excluded if one tries to use 
it for multiplication. At the same time, different problems can ask for more or less efforts both in 
their solution and in graphical presentation depending on the system chosen, and our intuition is 
important there. 

The programs exploiting the wavelet transform are widely used now not only for scientific 
research but for commercial purposes as well. Some of them have been even described in books 
(e.g., see |TI|)- At the same time, the direct transition from pure mathematics to computer 
programming and applications is non-trivial and asks often for the individual approach to the 
problem under investigation and for a specific choice of wavelets used. Our main objective here 
is to describe in a suitable way the bridge that relates mathematical wavelet constructions to 
practical signal processing. 

The discrete wavelets look as strangers to all those accustomed to analytical calculations be- 
cause they can not be represented by analytical expressions (except of a simplest one) or by 
solutions of some differential equations, and instead are given numerically as solutions of definite 
functional equations containing rescaling and translations. Moreover, in practical calculations 
their direct form is not required even, and the numerical values of the coefficients of the functional 
equation are only used. Thus wavelets are defined by the iterative algorithm of the dilation and 
translation of a single function. This leads to a very important procedure called multiresolution 
analysis which gives rise to the multiscale local analysis of the signal and fast numerical algo- 
rithms. Each scale contains an independent non-overlapping set of information about the signal 
in form of wavelet coefficients, which are determined from an iterative procedure called fast wavelet 
transform. In combination, they provide its complete analysis and the diagnosis of the underlying 
processes. 

After such an analysis was done, one can compress (if necessary) the resulting data by omitting 
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some inessential part of the encoded information. This is done with the help of the so-called 
quantization procedure which commonly allocates different weights to various wavelet coefficients 
obtained. In particular, it helps erase some statistical fluctuations and, therefore, increase the 
role of dynamical features of a signal. It can however falsify the diagnostic if the compression 
is done inappropriately. Usually, the accurate compression gives rise to a substantial reduction 
of the required computer storage memory and transmission facilities, and, consequently, to a 
lower expenditure. The number of vanishing moments of wavelets is important at this stage. 
Unfortunately, the compression introduces unavoidable systematic errors. The mistakes one has 
made will consist of multiples of the deleted wavelet coefficients, and, therefore, the regularity 
properties of a signal play an essential role. Reconstruction after such compression schemes is 
then not perfect any more. These two objectives are clearly antagonistic. Nevertheless, when 
one tries to reconstruct the initial signal, the inverse transformation (synthesis) happens to be 
rather stable and reproduces its most important characteristics if proper methods are applied. 
The regularity properties of wavelets used become also crucial at the reconstruction stage. The 
distortions of the reconstucted signal due to quantization can be kept small, although significant 
compression ratios are attained. Since the part of the signal which is not reconstructed is often 
called noise, in essence, what we are doing is denoising signals. Namely at this stage the superiority 
of the discrete wavelets becomes especially clear. 

Thus, the objectives of signal processing consist in the accurate analysis , effective coding, 
fast transmission and, finally, careful reconstruction (at the transmission destination point) of the 
initial signal. Sometimes the first stage of signal analysis and diagnosis is enough for the problem 
to be solved and anticipated goals to be achieved. 

It has been proven that any function can be written as a superposition of admissible wavelets, 
and there exists a numerically stable algorithm to compute the coefficients for such an expansion. 
Moreover, these coefficients completely characterize the function, and it is possible to reconstruct 
it in a numerically stable way by knowing these coefficients. Because of their unique properties, 
wavelets were used in functional analysis in mathematics, in studies of (multi)fractal properties, 
singularities and local oscillations of functions, for solving some differential equations, for investi- 
gation of inhomogeneous processes involving widely different scales of interacting perturbations, 
for pattern recognition, for image and sound compression, for digital geometry processing, for 
solving many problems of physics, biology, medicine, technique etc (see the recently published 



books 1 10, 11, |T|, y| 14])- This list is by no means exhaustive. 



One should however stress that, even though this method is very powerful, the goals of wavelet 
analysis are rather modest. It helps us describe and reveal some features, otherwise hidden in a 
signal, but it does not pretend to explain the underlying dynamics and physics origin although it 
may give some crucial hints to it. Wavelets present a new stage in optimization of this description 
providing the best known representation of a signal. With the help of wavelets, we merely see 
things a little more clearly. To understand the dynamics, standard approaches introduce models 
assumed to be driving the mechanisms generating the observations. To define the optimality of 
the algorithms of wavelet analysis , some (still debatable!) energy and entropy criteria have been 
developed. They are internal to the algorithm itself. However, the choice of the best algorithm is 
also tied to an objective goal of its practical use, i.e., to some external criteria. That is why in 
practical applications one should submit the performance of a "theoretically optimal algorithm" 
to the judgements of experts and users to estimate its advantage over the previously developed 
ones. 

Despite a very active research and impressive results, the versatility of wavelet analysis and 
existence of many different inversion formulas for wavelet transform (redundancy) imply that these 
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studies are presumably not in their final form yet. We shall try to describe the situation in its 
status nascendi. 

The main part of this paper (Sections 2-14) is devoted to description of general properties of 
wavelets and use of the wavelet transform in computer calculations. Some applications to different 
fields are briefly described in Section 15. 



2 Wavelets for beginners 

Each signal can be characterized by its averaged (over some intervals) values (trend) and by its 
variations around this trend. Let us call these variations as fluctuations independently of their 
nature, would it be dynamic, stochastic, psychological, physiological or any other origin. When 
processing a signal, one is interested in its fluctuations at various scales because therefrom one can 
learn about their origin. The goal of the wavelet analysis is to provide tools for such processing. 

Actually, physicists dealing with experimental histograms analyze their data at different scales 
when averaging over different size intervals. This is a particular example of a simplified wavelet 
analysis treated in this Section. To be more definite, let us consider the situation when an 
experimentalist measures some function f(x) within the interval < x < 1, and the best resolution 
obtained with the measuring device is limited by l/16th of the whole interval. Thus the result 
consists of 16 numbers representing the mean values of f(x) in each of these bins and can be 
plotted as a 16-bin histogram shown in the upper part of Fig. 1. It can be represented by the 
following formula 

15 

f( x ) = IZ s 4,fc^4,fc(a;), (1) 

k=0 

where = /(&/16)/4, and ip4 jk is defined as a step-like block of the unit norm (i.e. of a height 4 
and a widths 1/16) different from zero only within the k-th bin. For an arbitrary j, one imposes the 
condition J dx\ipj tk \ 2 = 1, where the integral is taken over the intervals of the lengths Axj = 1/2 J 
and, therefore, ip^ have the following form ipj ik = 2^ 2 Lp[2^x — k) with ip denoting a step-like 
function of the unit height in such an interval. The label 4 is related to the total number of such 
intervals in our example. At the next coarser level the average over the two neighboring bins is 
taken as is depicted in the histogram just below the initial one in Fig. 1. Up to the normalization 
factor, we will denote it as s^ k and the difference between the two levels shown to the right of this 
histogram as d 3 k . To be more explicit, let us write down the normalized sums and differences for 
an arbitrary level j as 

Sj ~ 1 ' k = ~^/2 iSj ' 2k + Sj ' 2k+1 ^ dj-l,k = -^[Sj,2k - Sj,2fc+l], ( 2 ) 

or for the backward transform (synthesis) 

S j,2k = -^=(Sj-l,fc + 4/-l,k); Sj,2fc+1 = ~^{Sj-l,k - dj-l,k)- (3) 

Since, for the dyadic partition considered, this difference has opposite signs in the neighboring 
bins of the previous fine level, we introduce the function ip which is 1 and -1, correspondingly, in 
these bins and the normalized functions ip^ k = 2^ 2 ^{2^x — k). It allows us to represent the same 
function f(x) as 

7 7 

f( x ) = J2 S 3,kV3,k(x) + J2 d 3,k*p3,k( X )- ( 4 ) 
k=0 k=0 
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One proceeds further in the same manner to the sparser levels 2, 1 and with averaging done 
over the interval lengths 1/4, 1/2 and 1, correspondingly. It is shown in the subsequent drawings 
in Fig. 1. The most sparse level with the mean value of / over the whole interval denoted as so,o 
provides 

i 

f(x) = s 0fi ip 0fi (x) + d 0fi (x)ip 0fi (x) + di,kipi,ki x ) 

k=0 

3 7 

+ d2,k1p2,k 0) + d 3,k^3,k (x) ■ (5) 
fc=0 k=0 

The functions ipo t o(x) and ipo,o{x) are shown in Fig. 2. The functions (fj^{x) an d ipj,k(%) are the 
normalized by the conservation of the norm, dilated and translated versions of them. In the next 
Section we will give explicit formulae for them in a particular case of Haar scaling functions and 
wavelets. In practical signal processing, these functions (and their more sophisticated versions) 
are often called as the low and high-path filters, correspondingly, because they filter the large 
and small scale components of a signal. The subsequent terms in Eq. (||) show the fluctuations 
(differences dj t k) at finer and finer levels with larger j. In all the cases (II])— (0) one needs exactly 
16 coefficients to represent the function. In general, there are 2 J coefficients Sj t f- and 2- Jn — 2 J 
coefficients dj^, where j n denotes the finest resolution level (in the above example, j n = 4). 

All the above representations of the function f(x) (Eqs. (HI)-©) are mathematically equivalent. 
However, the latter one representing the wavelet analyzed function directly reveals the fluctuation 
structure of the signal at different scales j and various locations k present in a set of coefficients 
dj t k whereas the original form ([!]) hides the fluctuation patterns in the background of a general 
trend. The final form (|5]) contains the overall average of the signal depicted by So,o and all 
its fluctuations with their scales and positions well labelled by 15 normalized coefficients djk 
while the initial histogram shows only the normalized average values Sj ^ in the 16 bins studied. 
Moreover, in practical applications the latter wavelet representation is preferred because for rather 
smooth functions, strongly varying only at some discrete values of their arguments, many of the 
high-resolution (i-coefficients in relations similar to Eq. (||) are close to zero and can be discarded. 
Bands of zeros (or close to zero values) indicate those regions where the function is smooth enough. 

At first sight, this simplified example looks somewhat trivial. However, for more complicated 
functions and more data points with some elaborate forms of wavelets it leads to a detailed 
analysis of a signal and to possible strong compression with subsequent good quality restoration. 
This example provides also the illustration for the very important feature of the whole approach 
with successive coarser and coarser approximations to / called the multiresolution analysis and 
discussed in more detail below. 

3 Basic notions and Haar wavelets 

To analyze any signal, one should, first of all, choose the corresponding basis, i.e., the set of 
functions to be considered as "functional coordinates". In most cases we will deal with signals 
represented by the square integrable functions defined on the real axis (or by the square summable 
sequences of complex numbers). They form the infinite-dimensional Hilbert space L 2 (R) (l 2 (Z)). 
The scalar product of these functions is defined as 

/oo 
dxf(x)g(x), (6) 
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where the bar stands for the complex conjugate. For finite limits in the sum it becomes a finite- 
dimensional Hilbert space. Hilbert spaces always have orthonormal bases, i.e., families of vectors 
(or functions) e n such that 

(^n> ^m) $nm) (7) 

||/|| 2 = [dx\f(x)f = j:\(f,e n )\ 2 . (8) 

In the Hilbert space, there exist some more general families of linear independent basis vectors 
called a Riesz basis which generalize the equation (|) to 

<x\\f\\ 2 <Y.\(f,Zn)\ 2 <(3\\f\\ 2 (9) 

n 

with a > 0, (3 < oo. It is the unconditional basis where the order in which the basis vectors are 
considered does not matter. Any bounded operator with a bounded inverse maps an orthonormal 
basis into a Riesz basis. 

Sometimes we will consider the spaces L P (R)(1 < p < oo; p ^ 2), where the norm is defined 

as 

\\f\\ LP = {Jdx\f(x)\^, (10) 

as well as some other Banach spaces^ (see Sections 11, 12). 

The Fourier transform with its trigonometric basis is well suited for the analysis of stationary 
signals. Then the norm ||/|| is often called energy. For nonstationary signals, e.g., the location of 
that moment when the frequency characteristics has abruptly been changed is crucial. Therefore 
the basis should have a compact support. The wavelets are just such functions which span the 
whole space by translation of the dilated versions of a definite function. That is why every signal 
can be decomposed in the wavelet series (or integral). Each frequency component is studied with a 
resolution matched to its scale. The above procedure of normalization of functions (pj jk is directly 
connected with the requirement of conservation of the norm of a signal at its decompositions. 

The choice of the analyzing wavelet is, however, not unique. One should choose it in accordance 
with the problem to be solved. The simplicity of operations (computing, in particular) and of 
representation (minimum parameters used) plays also a crucial role. Bad choice may even prevent 
from getting any answer as in the above example with Roman numbers. There are several methods 
of estimating how well the chosen function is applicable to the solution of a particular problem 
(see Section 6). 

Let us try to construct functions satisfying the above criteria. "An educated guess" would be 
to relate a function <p(x) to its dilated and translated version. The simplest linear relation with 
2M coefficients is 

2M-1 

ip{x) = y/2 h k ip{2x-k) (11) 

k=0 

with the dyadic dilation 2 and integer translation k. At first sight, the chosen normalization of the 
coefficients h k with the factor \/2 looks somewhat arbitrary. Actually, it is defined a'posteriori 
by the traditional form of fast algorithms for their calculation (see Eqs. (0) and (|4lD below) 
and normalization of functions tpj^x), ipj,k{x)- It is used in all the books cited above. However, 
sometimes (see @, Chapter 7) it is replaced by c k = V^h k . 

3 Those are linear spaces equipped with norm which generally is not derived from a scalar product and complete 
with respect to this norm (L P (R)(1 < p < oo; p ^ 2) is a particular example of it.) 
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For discrete values of the dilation and translation parameters one gets discrete wavelets. The 
value of the dilation factor determines the size of cells in the lattice chosen. The integer M defines 
the number of coefficients and the length of the wavelet support. They are interrelated because 
from the definition of h k for orthonormal bases 

h k = V2 J dxy{x)(p{2x - k) (12) 

it follows that only finitely many h k are nonzero if <p has a finite support. The normalization 
condition is chosen as 

dx<p(x) = 1. (13) 



The function (p(x) obtained from the solution of this equation is called a scaling function^. If the 
scaling function is known, one can form a "mother wavelet " (or a basic wavelet ) if>(x) according 
to 



*P(x) = V2 ]T g k <p{2x-k), (14) 

k=0 

where 

g k = (-l) fe /i 2 M-fc-i- (15) 

The simplest example would be for M = 1 with two non-zero coefficients h k equal to l/v2, 
i.e., the equation leading to the Haar scaling function ip H (x): 

(p H {x) = tp H {2x) + tp H {2x-l). (16) 

One easily gets the solution of this functional equation 

ip H (x)=e(x)9(l-x), (17) 

where 9(x) is the Heaviside step-function equal to 1 at positive arguments and at negative ones. 
The additional boundary condition is ^#(0) = 1, ^#(1) = 0- This condition is important for 
the simplicity of the whole procedure of computing the wavelet coefficients when two neighboring 
intervals are considered. 
The "mother wavelet " is 

ifj H {x) = 6(x)6(l - 2x) - 6{2x - 1)0(1 - x). (18) 

with boundary values defined as ^(0) = 1, ^#(1/2) = —1, t/>h(1) = 0. This is the Haar 
wavelet [|15|] known since 1910 and used in the functional analysis . Namely this example has been 
considered in the previous Section fot the histogram decomposition. Both the scaling function 
<Ph{x) an d the "mother wavelet " ipn{x) are shown in Fig. 2. It is the first one of a family of 
compactly supported orthonormal wavelets m^P '■ —i if>- It possesses the locality property 
since its support 2M — 1 = 1 is compact. 

The dilated and translated versions of the scaling function ip and the "mother wavelet" if) 

<p jik = 2 j / 2 ip{2 j x-k), (19) 

if) jtk = 2 j/2 ip(2 j x - k) (20) 



l lt is often called also a "father wavelet " but we will not use this term. 
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form the orthonormal basis as can be (easily for Haar wavelets) checked^. The choice of 2 J 
with the integer valued j as a scaling factor leads to the unique and selfconsistent procedure of 
computing the wavelet coefficients. In principle, there exists an algorithm of derivation of the 
compact support wavelets with an arbitrary rational number in place of 2. However, only for this 
factor, it has been shown that there exists the explicit algorithm with the regularity of a wavelet 
increasing linearly with its support. For example, for the factor 3 the regularity index only grows 
logarithmically. The factor 2 is probably distinguished here as well as in music where octaves 
play a crucial role. If the dilation factor is 2, then the Fourier transform of the "mother" wavelet 
is essentially localized between ir and 2n. However, for some practical applications, a sharper 
frequency localization is necessary, and it may be useful to have wavelet bases with a narrower 
bandwidth. The fractional dilation wavelet bases provide one of the solutions of this problem but 
there also exist other possibilities. 
The Haar wavelet oscillates so that 

dxip{x) = 0. (21) 

This condition is common for all the wavelets. It is called the oscillation or cancellation condition. 
From it, the origin of the name wavelet becomes clear. One can describe a "wavelet" as a function 
that oscillates within some interval like a wave but is then localized by damping outside this 
interval. This is a necessary condition for wavelets to form an unconditional (stable) basis. We 
conclude that for special choices of coefficients h k one gets the specific forms of "mother" wavelets 
and orthonormal bases. 

One may decompose any function / of L 2 (R) at any resolution level j n in a series 

/ = J2 S 3n,k ( Pjn,k + H d j,k^j,k- (22) 

k j>jn,k 

At the finest resolution level j n = j max only s-coefficients are left, and one gets the scaling-function 
representation 

f( x ) = S s o max ,Wj max ,k- (23) 

k 

In the case of the Haar wavelets it corresponds to the initial experimental histogram with finest 
resolution. Since we will be interested in its analysis at varying resolutions, this form is used as 
an initial input only. The final representation of the same data ( |2"2"D shows all the fluctuations in 
the signal. The wavelet coefficients Sj ik and dj >k can be calculated as 



s jik = J dxf(x)(p jtk (x), (24) 

dj,k = J dxf(x)ip jjk (x). (25) 

However, in practice their values are determined from the fast wavelet transform described below. 

In reference to the particular case of the Haar wavelet , considered above, these coefficients 
are often referred as sums (s) and differences (d), thus related to mean values and fluctuations. 

For physicists familiar with experimental histograms, it generalizes the particular example 
discussed in the previous Section. The first sum in (|22|) with the scaling functions (pj k shows the 



5 We return back to the general case and therefore omit the index H because the same formula will be used for 
other wavelets. 
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average^ values of / within the dyadic intervals [k2~i , (k + 1)2~ J ), and the second term contains 
all the fluctuations of the function / in this interval. They come from ever smaller intervals which 
correspond to larger values of the scale parameter j. One would say that it "brings into focus" the 
finer details of a signal. This touching in of details is regularly spaced, taking account of dimension 
- the details of dimension 2 _J are placed at the points k1~K At the lowest (most sparse) level jo 
the former sum consists of a single term with the overall weighted average (/) = Sj 0i k , where ko 
is the center of the histogram. The second sum in (p2[) shows fluctuations at all the scales. At 
the next, more refined level ji > jo, there are two terms in the first sum which show the average 
values of / within half-intervals with their centers positioned at k%, &2- The number of terms in 
the second sum becomes less by one term which was previously responsible for the fluctuations 
at the half-interval scale. The total number of terms in the expansion stays unchanged. Here 
we just mention that according to (^) the number of terms in each sum depends on a definite 
resolution level. Changing the level index by 1, we move some terms to another sum, and each of 
these representations are "true" representations of the histogram at different resolution levels. 

Formally a similar procedure may be done the other way round by going to the sparser resolu- 
tions j < jo- Even if we "fill out" the whole support of /, we can still keep going with our averaging 
trick. Then the average value of / diminishes, and one can neglect the first sum in fl2"2"|) in the L 2 
sense because its L 2 -norm (§) tends to zero. In the histogram example, it decreases as |(/)| oc iV _1 
and \(f)\ 2 oc iV~ 2 whereas the integration region is proportional to N, i.e., ||(/)|| 2 oc N^ 1 — > 
for iV — > oo. That is why only the second term in fl2~2"|) is often considered, and the result is 
often called as the wavelet expansion. It also works if / is in the space L P (R) for 1 < p < oo 
but it can not be done if it belongs to L X (R) or L°°(R). For example, if / = 1 identically, all 
wavelet coefficients dj^ are zero, and only the first sum matters. For the histogram interpretation, 
the neglect of this sum would imply that one is not interested in its average value but only in 
its shape determined by fluctuations at different scales. Any function can be approximated to a 
precision 2 J//2 (i.e., to an arbitrary high precision at j — > — oo) by a finite linear combination of 
Haar wavelets. 

The Haar wavelets are also suitable for the studies of functions belonging to L p spaces, i.e., 
possessing higher moments. 

Though the Haar wavelets provide a good tutorial example of an orthonormal basis, they suffer 
from several deficiences. One of them is the bad analytic behavior with the abrupt change at the 
interval bounds, i.e., its bad regularity properties. By this we mean that moments of the Haar 
wavelet are different from zero - only the integral (pl[) of the function itself is zero. The Haar 
wavelet does not have good time-frequency localization. Its Fourier transform decays like | c<^ | 1 
for uj — > oo. 

It would be desirable to build up wavelets with better regularity. The advantage of them 
compared to the Haar system shows up in the smaller number of the wavelet coefficients which are 
large enough to account for and in their applicability to a wider set of functional spaces besides 
L 2 . The former feature is related to the fact that the wavelet coefficients are significantly different 
from zero only near singularities of / (strong fluctuations!). Therefore wavelet series of standard 
functions with isolated singularities are "sparse" series in contrast to the Fourier series which are 
usually dense ones for rather regular functions. The latter feature allows us to get an access to 
local and global regularities of functions under investigation. The way to this program was opened 
by the multiresolution analysis . 



6 The averaging is done with the weight functions (pj > k(x). 
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4 Mult iresolut ion analysis and Daubechies wavelets 



The relation (22) shows that a general function / can be approximated by a sequence of very 
simple functions <fij,k, ipj,k- The above example has demonstrated that the Haar functions are 
local and cover the whole space L 2 (R) by using the translation k. They are orthogonal for different 
resolution scales j. The transition from j to j + 1 is equivalent to the replacement of x by 2x, i.e., 
to the rescaling which allows for the analysis to be done at various resolutions. 

However the Haar wavelets are oversimplified and not regular enough. The goal is to find a 
general class of those functions which would satisfy the requirements of locality, regularity and 
oscillatory behavior. Note that in some particular cases the orthonormality property sometimes 
can be relaxed. They should be simple enough in the sense that they are of being sufficiently 
explicit and regular to be completely determined by their samples on the lattice defined by the 
factors 2K 

The general approach which respects these properties is known as the multiresolution approx- 
imation. A rigorous mathematical definition is given in Appendix 1. Here we just describe its 
main ingredients. 

The multiresolution analysis consists of a sequence of successive approximation spaces Vj which 
are scaled and invariant under integer translation versions of the central functional space Vq. To 
explain the meaning of these spaces in a simple example, we show in Fig. 3 what the projections 
of some function on the Haar spaces Vq, V% might look like. One easily recognizes the histogram 
representation of this function. The comparison of histograms at the two levels shows that the 
first sum in Eq. (|22|) provides the "blurred image" or "smoothed means" of f(x) in each interval, 
while the second sum of this equation adds finer and finer details of smaller sizes. Thus the 
general distributions are decomposed into the series of correctly localized fluctuations having a 
characteristic form defined by the wavelet chosen. 

The functions tpj^ form an orthonormal basis in Vj. The orthogonal complement of Vj in Vj+i 
is called Wj. The subspaces Wj form a mutually orthogonal set. The sequence of ipj,k constitutes 
an orthonormal basis for Wj at any definite j. The whole collection of ipj,k and tpj t k for all j 
is an orthonormal basis for L 2 (R). This ensures us that we have constructed a multiresolution 
analysis approach, and the functions ipj,k and (fj^ constitute the small and large scale filters, 
correspondingly. The whole procedure of the multiresolution analysis can be demonstrated in 
graphs of Fig. 4. 

In accordance with the above declared goal, one can define the notion of wavelets (see Appendix 
1) so that the functions 2^ 2 ip[2 : >x — k) are the wavelets (generated by the "mother" ip), possessing 
the regularity, the localization and the oscillation properties. 

At first sight, from our example with Haar wavelets, it looked as if one is allowed to choose 
coefficients hk at his own wish. This impression is, however, completely wrong. The general prop- 
erties of scaling functions and wavelets define these coefficients in a unique way in the framework 
of the multiresolution analysis approach. 

Let us show how the program of the multiresolution analysis works in practice when applied to 
the problem of finding out the coefficients of any filter and They can be directly obtained 
from the definition and properties of the discrete wavelets. These coefficients are defined by the 
relations (0) and ([TJ) 

ip(x)=^Y l h k V$x-k)\ i/){x)=V2Y,9k<p(2x-k), (26) 
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where J2k \ hk\ 2 < oo. The orthogonality of the scaling functions defined by the relation 

dx(p(x)if(x — m) = 
leads to the following equation for the coefficients: 

k 

The orthogonality of wavelets to the scaling functions 

J dxi/j(x)(p(x — m) = 

gives the equation 

k 

having a solution of the form 

9k = ( — 1) h 2 M-l-k- 

Another condition of the orthogonality of wavelets to all polynomials up to the power 
defining its regularity and oscillatory behavior 

J dxx n tp(x) = 0, n = 0,...,(M-l), 

provides the relation 

E k n 9k = 0, 

k 

giving rise to 

J2(-i) k k n h k = 0, 



when the formula ( 31f) is taken into account. 
The normalization condition 

dxcp(x) = 1 



can be rewritten as another equation for h^: 

X> fe = V2. 

k 

Let us write down the equations (|28|),(|34|), ( |36D for M = 2 explicitly: 

h h 2 + hihi = 0, 
h - h 1 + h 2 - h 3 = 0, 
-hi + 2h 2 - 3h 3 = 0, 
ho + hi + h 2 + h 3 = V2. 

The solution of this system is 

h 3 = ^=(l±V3), h 2 = ^= + h 3 , h 1 = -L-h 3 , h = ^= 
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that, in the case of the minus sign for corresponds to the well known filter 



k ° = Vf (1 + V ^ } ' fcl= V5 (3 + V ^' h * = ^ 3 -^> fc8= 4^( 1 -^' (38) 

These coefficients define the simplest D 4 (or 2VO wavelet from the famous family of orthonormal 
Daubechies wavelets with finite support. It is shown in the upper part of Fig. 5 by the dotted line 
with the corresponding scaling function shown by the solid line. Some other higher rank wavelets 
are also shown there. It is clear from this Figure (especially, for D A ) that wavelets are smoother in 
some points than in others. The choice of the plus sign in the expression for h% would not change the 
general shapes of the scaling function and wavelet D 4 . It results in their mirror symmetrical forms 
obtained by a simple reversal of the signs on the horizontal and vertical axes, correspondingly. 
However, for higher rank wavelets different choices of signs would correspond to different forms of 
the wavelet . After the signs are chosen, it is clear that compactly supported wavelets are unique, 
for a given multiresolution analysis up to a shift in the argument (translation) which is inherently 
there. The dilation factor must be rational within the framework of the multiresolution analysis . 
Let us note that 2f is Holder continuous with the global exponent a = 0.55 (see Eq. ([53]) below) 
and has different local Holder exponents on some fractal sets. Typically, wavelets are more regular 
in some points than in others. 

For the filters of higher order in M, i.e., for higher rank Daubechies wavelets, the coefficients 
can be obtained in an analogous manner. It is however necessary to solve the equation of the 
M-th power in this case. Therefore, the numerical values of the coefficients can be found only 
approximately, but with any predefined accuracy. The wavelet support is equal to 2M — 1. It 
is wider than for the Haar wavelets. However the regularity properties are better. The higher 
order wavelets are smoother compared to D A as seen in Fig. 5. The Daubechies wavelet with r 
vanishing moments has fir continuous derivatives where fi ~ 0.2 as was estimated numerically. 
This means that 80% of zero moments are "wasted". As the regularity r increases, so does the 
support in general. For sufficiently regular functions, Daubechies wavelet coefficients are much 
smaller (2~ ^ times) than the Haar wavelet coefficients, i.e., the signal can be compressed much 
better with Daubechies wavelets. Since they are more regular, the synthesis is more efficient also. 

One can ask the question whether the regularity or the number of vanishing moments is more 
important. The answer depends on the application, and is not always clear. It seems that the 
number of vanishing moments is more important for stronger compression which increases for 
larger number of vanishing moments, while regularity can become crucial in inverse synthesis to 
smooth the errors due to the compression (omission of small coefficients). 



In principle, by solving the functional equation (11 ) one can find the form of the scaling function 
and, from (fl4|), the form of the corresponding "mother wavelet ". There is no closed- form analytic 
formula for the compactly supported <p(x), ip{x) (except for the Haar case). Nevertheless one can 
compute their plots, if they are continuous, with arbitrarily high precision using a fast cascade 
algorithm with the wavelet decomposition of (p(x) which is a special case of a refinement scheme 
(for more details, see @). Instead of a refinement cascade one can compute (p(2~ik) directly from 
Eq. ( |TT| ) starting from appropriate (p(n). However, in practical calculations the above coefficients 
hk are used only without referring to the shapes of wavelets. 

Except for the Haar basis, all real orthonormal wavelet bases with compact support are asym- 
metric, i.e., they have neither symmetry nor antisymmetry axis (see Fig. 5). The deviation of a 
wavelet from symmetry is judged by how much the phase of the expresion m^u) = Ylkhk e ~ lkui 
deviates from a linear function. The "least asymmetric" wavelets are constructed by minimizing 
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this phase. Better symmetry for a wavelet necessarily implies better symmetry for the coefficients 
hk but the converse statement is not always true. 

5 Fast wavelet transform and coiflets 

After calculation of the coefficients hk and g^, i.e., the choice of a definite wavelet , one is able to 
perform the wavelet analysis of a signal f(x) because the wavelet orthonormal basis (V^.fei ¥j,k) 
has been defined. Any function / e L 2 (R) is completely characterized by the coefficients of its 
decomposition in this basis and may be decomposed according to the formula (|22|). Let us make 
the sum limits in this formula more precise. The function f(x) may be considered at any n-th 
resolution level j n . Then the separation of its average values and fluctuations at this level looks 
like 

oo oo oo 

f( X ) = S 3n,k<Pjn,k( X ) + 12 12 d 3^jA X )- ( 39 ) 

k = — OO j = jn fc = — OO 

At the infinite interval, the first sum may be omitted as explained above, and one gets the pure 
wavelet expansion. As we stressed already, the coefficients Sj^ and dj t k carry information about 
the content of the signal at various scales. They can be calculated directly using the formulas 
(p4|), (|25|). However this algorithm is inconvenient for numerical computations because it requires 
many (N 2 ) operations where N denotes a number of the sampled values of the function. We will 
describe a faster algorithm. It is clear from Fig. 6, and the fast algorithm' formulas are presented 
below. 

In real situations with digitized signals, we have to deal with finite sets of points. Thus, there 
always exists the finest level of resolution where each interval contains only a single number. 
Correspondingly, the sums over k will get the finite limits. It is convenient to reverse the level 
indexation assuming that the label of this fine scale is j = 0. It is then easy to compute the 
wavelet coefficients for more sparse resolutions j > 1 . 

The multiresolution analysis naturally leads to an hierarchical and fast scheme for the compu- 
tation of the wavelet coefficients of a given function. The functional equations (|TTf) , fll4|) and the 
formulas for the wavelet coefficients fl2"4]), Q25D give rise, in case of Haar wavelets, to the relations 
(0), or for the backward transform (synthesis) to (|3|). 

In general, one can get the iterative formulas of the fast wavelet transform 

s j+l,k — 12 h"m.Sj t 2k+m, (40) 
m 

dj+l s k = 12 9m$j,2k (41) 
m 

where 

so,k = J dxf(x)(p(x - k). (42) 

These equations yield fast algorithms (the so-called pyramid algorithms) for computing the wavelet 
coefficients, asking now just for 0(N) operations to be done. Starting from s ,fc> one computes all 
other coefficients provided the coefficients h m , g m are known. The explicit shape of the wavelet 
is not used in this case any more. The simple form of these equations is the only justification for 
introducing the factor \/2 in the functional equation (|TTD . In principle, the coefficients h m , g m 
could be renormalized. However, in practice the Eqs. ( |40"D and ( |4TD are used much more often 
than others, and this normalization is kept intact. After choosing a particular wavelet for analysis 
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,i.e., choosing h m , g m , one uses only the Eqs. ( fiOD and ( f4T| ) for computing the wavelet coefficients, 
and additional factors in these equations would somewhat complicate the numerical processing. 

The remaining problem lies in the initial data. If an explicit expression for f(x) is available, 
the coefficients So,fc may be evaluated directly according to (f£2j). But this is not so in the situation 
when only discrete values are available. To get good accuracy, one has to choose very small bins 
(dense lattice) which is often not accessible with finite steps of sampling. In such a case, the 
usually adopted simplest solution consists in directly using the values f(k) of the available sample 
in place of the coefficients So,fc and start the fast wavelet transform using formulas Q4"0|), (pETp. This 
is a safe operation since the pyramid algorithm yields perfect reconstruction, and the coefficient 
so,fc essentially represents a local average of the signal provided by the scaling function. 

In general, one can choose 

s o,k = c mf(k - m). (43) 

m 

The above supposition s ,fc = f(k) corresponds to c m = 5o m . This supposition may be almost 
rigorous for some specific choices of scaling functions named coiflets (by the name of R. Coif- 
man whose ideas inspired I. Daubechies to build up these wavelets). It is possible to construct 
multiresolution analysis with the scaling function having vanishing moments, i.e., such that 

dxx m (p(x) = 0; < m < M. (44) 

To construct such wavelets (coiflets), one has to add to the equations for determining the coeffi- 
cients hk a new condition 

h k k m = 0, < m < M, (45) 

k 



which follows from the requirement ([44]). 

Coiflets are more symmetrical than Daubechies wavelets as is seen from Fig. 7 if compared 
to Fig. 5. The latter do not have the property (^4|). The price for this extra generalization is 
that coiflets are longer than Daubechies wavelets. If in the latter case the length of the support 
is 2M — 1, for coiflets it is equal to 3M — 1. The error in the estimation of sj t k decreases with the 
number of vanishing moments as 0(2 _jA/ ). 

There are other proposals to improve the first step of the iterative procedure promoted by 



using T-polynomials by W. Sweldens |16| or the so-called "lazy" or interpolating wavelets by S. 
Goedecker and O. Ivanov []17|| . The latter is most convenient for simultaneous analysis at different 
resolution levels, in particular for non-equidistant lattices. 

Inverse fast wavelet transform allows for the reconstruction of the function starting from its 
wavelet coefficients. 



6 Choice of wavelets 

Above, we demonstrated three examples of discrete orthonormal compactly supported wavelets. 
The regularity property, the number of vanishing moments and the number of wavelet coefficients 
exceeding some threshold value were considered as possible criteria for the choice of a particular 
wavelet not to say about computing facilities. Sometimes the so-called information cost functional 
used by statisticians is introduced, and one tries to minimize it and thus select the optimal basis. 
In particular, the entropy criterium for the probability distribution of the wavelet coefficients is 



also considered pi 1 1 . The entropy of / relative to the wavelet basis measures the number of 
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significant terms in the decomposition (P^fl. It is defined by exp(— J2j,k Mj,fc| 2 log \dj^\ 2 )- If we 
have a collection of orthonormal bases, we will choose for the analysis of / the particular basis 
that yields the minimum entropy. 

The number of possible wavelets at our disposal is much larger than the above examples show. 
We will not discuss all of them just mentioning some and referring the reader to the cited books. 

• First, let us mention splines which lead to wavelets with non-compact support but with the 
exponential decay at infinity and with some (limited) number of continuous derivatives. The 
special orthogonalization tricks should be used there. The splines are intrinsically associated 
with interpolation schemes for finding more precise initial values of So,fc relating them to some 
linear combinations of the sampled values of f(x). 

• To insure both the full symmetry and exact reconstruction, one has to use the so-called 
biorthogonal wavelets. It means that two dual wavelet bases ipj jk and ifj^ k , associated with 
two different multiresolution ladders, are used. They can have very different regularity 
properties. A function / may be represented in two forms, absolutely equivalent until the 
compression is done: 

/ !:(/•'•>)'•••'•' (46) 

3,k 

= Ett^i* (47) 

j,k 

where tp and its dual wavelet satisfy the biorthogonality requirement (ipj^l^ ' k ) — 3j,k;j',k'- 
In distinction to Daubechies wavelets, where regularity is tightly connected with the number 
of vanishing moments, biorthogonal wavelets have much more freedom. If one of them has the 
regularity of the order r, then its dual partner wavelet has automatically at least r vanishing 
moments. If ipi' is much more regular than ipj,k, then ipj^ has many more vanishing moments 
than ijji ,k . It allows us to choose, e.g., very regular ip^ k and get many vanishing moments 
of ipj t k- The large number of vanishing moments of ipj ^ leads to better compressibility for 
reasonably smooth /. If the compression has been done, the formula ( flEf ) is much more useful 
than (f|7]). The number of significant terms is much smaller, and, moreover, better regularity 
of il)i )k helps reconstruct / more precisely. Biorthogonal bases are close to an orthonormal 
basis. Both wavelets can be made symmetric. Symmetric biorthogonal wavelets close to 
orthonormal basis are close to coiflets. The construction of biorthogonal wavelet bases is 
simpler than of orthonormal bases. 

• The existence of two-scale relations is the main feature of the construction of wavelet packets. 
The general idea of the wavelet packets is to iterate further the splitting of the frequency 
band, still keeping the same pair of filters. The scaling function introduced above acquires 
the name wq, and the packet is built up through the following iterations 

w 2n {x) = E h k w n (2x - k), (48) 

k 

w 2n +i(x) = ^g k w n (2x - k). (49) 

k 

The usual mother wavelet is represented by W\ . This family of wavelets forms an orthonormal 
basis in L 2 (R) which is called the fixed scale wavelet packet basis. Fig. 8 demonstrates the 
whole construction. 
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One can abandon the orthonormality property and construct non-orthogonal wavelets called 
frames. An important special class of frames is given by the Riesz bases of L 2 (R). A Riesz 
basis is a frame, but the converse is not true in general. The frames satisfy the following 
requirement: 

A\\f\\ 2 <£!</, ^>| 2 <£|mi 2 . (50) 

The constants A and B are called the frame bounds. For A = B one calls them as tight 
frames. The case A = B = 1 corresponds to orthonormal wavelets. 

When acting by (singular) operators, one sometimes gets infinities when usual wavelets are 
considered. Some suitable function b(x) may be used to specify the extra conditions which 
will be necessary and sufficient for the result of the (singular integral) operator action to be 
continuous in L 2 . In this case one chooses the so-called "wavelets which are adapted to 6". 
Any function / is again decomposed as 

f(x) = J2a(X)^\x) (51) 

A 

but the wavelet coefficients are now calculated according to 

a x = J dxb(x)f(x)ip x b) (x). (52) 
They satisfy the normalization condition 

'dxb(x)^\x)i/;f)(x) = Sx,x'. (53) 
The cancellation condition now reads 

dxb(x)i)[ b \x) = 0. (54) 



As we see, the cancellation is also adapted to the function b (in general, to the "complex 
measure" dxb(x)). 

Up to now we considered the wavelets with the dilation factor equal 2. It is most convenient 
for numerical calculations. However, it can be proved JIB, |2|] that within the framework of a 



multiresolution analysis , the dilation factor must be rational, and no other special require- 
ments are imposed. Therefore one can construct schemes with other integer or fractional 
dilation factors. Sometimes they may provide a sharper frequency localization. For wavelets 
with the dilation factor 2, their Fourier transform is essentially localized within one octave 
between ir and 2tt, whereas the fractional dilation wavelet bases may have a bandwidth 
narrower than one octave. 

Moreover, one can use the continuous wavelets as described at some length in Ref. [[J. The 
wavelet is written in a form 

iM*) = \a\- 1/2 i>(—). (55) 
a 

The direct and inverse formulas of the wavelet transform look like 

W„ b = H- 1 / 2 ( dx/(a;M— ), (56) 
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Here 



f(x) = C^ f^-Wa^ix). (57) 



Herefrom one easily recognizes that the oscillation of wavelets required by Eq. (|21|) is a 
general property. The vanishing Fourier transform of a wavelet at u — > 0, which is just 
directly the condition (pip, provides a finite value of in fl58p. One of the special and 
often used examples of continuous wavelets is given by the second derivative of the Gaussian 
function which is called the Mexican hat (MHAT) wavelet for its shape. Actually, it can 
be considered as a special frame as shown by Daubechies. The reconstruction procedure 
(synthesis) is complicated and can become unstable in this case. However it is widely 
applied for analysis of signals. The formula ([56]) is a kind of a convolution operation. That 
is why the general theory of the so-called Calderon-Zygmund operators @ (see Appendix 2) 
is, in particular, applicable to problems of the wavelet decomposition. 



7 Multidimensional wavelets 

The multiresolution analysis can be performed in more than 1 dimensions. There are two ways 
to generalize it to the two-dimensional case, for example, but we will consider the most often used 
construction given by tensor products. The tensor product method is a direct way to construct 
an r-regular multiresolution approximation which produces multidimensional wavelets of compact 
support. This enables us to analyze every space of functions or distributions in n dimensions 
whose regularity is bounded by r. 

The trivial way of constructing a two-dimensional orthonormal basis starting from a one- 
dimensional orthonormal wavelet basis ipj t k(x) = 2^ 2 tjj(2^x — k) is simply to take the tensor 
product functions generated by two one-dimensional bases: 

^jx,ki;hM( x ^ x 2) = ^hM^MhM^)- ( 59 ) 

In this basis the two variables Xi and x 2 are dilated independently. 

More interesting for many applications is another construction, in which dilations of the re- 
sulting orthonormal wavelet basis control both variables simultaneously, and the two-dimensional 
wavelets are given by the following expression: 

2 j V(2 j x - k, 2 j y - /), j, k,leZ, (60) 

where \& is no longer a single function: on the contrary, it consists of three elementary wavelets. 
To get an orthonormal basis of Wo one has to use in this case three families (p(x — k)tp(y — 
I), ip(x — k)<p(y — l), ip(x — k)ip(y — l). Then the two-dimensional wavelets are 2^(p{2Px — k)ip(2iy — 
I), 2^ r i\){2?x — k)(p(2?y — I), 2^ip(2Px — k)tp(2^y — /). In the two-dimensional plane, the analysis 
is done along the horizontal, vertical and diagonal strips with the same resolution in accordance 
with these three wavelets. 

Fig. 9 shows how this construction looks like. The schematic representation of this procedure 
in the left-hand side of the Figure demonstrates how the corresponding wavelet coefficients are 
distributed for different resolution levels (j = 1 and 2). The superscripts d,v and h correspond 
to diagonal, vertical and horizontal coefficients. In the right-hand side of the Figure, a set of 
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geometrical objects is decomposed into two layers. One clearly sees how vertical, horizontal and 
diagonal edges are emphasized in the corresponding regions. One should notice also that the 
horizontal strip is resolved into two strips at a definite resolution level. 

In the general n-dimensional case, there exist 2 n — l functions which form an orthonormal basis 
and admit the multiresolution analysis of any function from L 2 (R n ). The normalization factor is 
equal to 2 n ^ 2 in front of the function, as can be guessed already from the above two-dimensional 
case with this factor equal to 2 J in contrast to 2 J / 2 in one dimension. 

There exists also a method to form the wavelet bases which are not reducible to tensor products 
of one-dimensional wavelets (see 0]). In dimension 1, every orthonormal basis arises from a 
multiresolution approximation. In dimensions greater than 1, it is possible to form an orthonormal 
basis such that there is no r-regular multiresolution approximation (r > 1) from which these 
wavelets can be obtained |§. 



8 The Fourier and wavelet transforms 

In many applications (especially, for non-stationary signals), one is interested in the frequency 
content of a signal locally in time, i.e., tries to learn which frequencies are important at some 
particular moment. As has been stressed already, the wavelet transform is superior to the Fourier 
transform, first of all, due to the locality property of wavelets. The Fourier transform uses sine, 
cosine or imaginary exponential functions as the main basis. It is spread over the entire real axis 
whereas wavelet basis is localized. It helps analyze the local properties of a signal using wavelets 
while the Fourier transform does not provide any information about the location where the scale 
(frequency) of a signal changes. Decomposition into wavelets allows singularities to be located 
by observing the places where the wavelet coefficients are (abnormally) large. Obviously, nothing 
of the kind happens for the Fourier transform. Once the wavelets have been constructed they 
perform incredibly well in situations where Fourier series and integrals involve subtle mathematics 
or heavy numerical calculations. But wavelet analysis cannot entirely replace Fourier analysis , 
indeed, the latter is often used in constructing the orthonormal bases of wavelets needed for 
analysis with wavelet series. Many theorems of wavelet analysis are proven with the help of the 
Fourier decomposition. The two kinds of analysis are thus complementary rather than always 
competing. 

The Fourier spectrum f w of a one-dimensional signal f(t) having finite energy (i.e., square- 
integrable) is given by 

/oo 
f(t)e-^dt. (61) 

The inverse transform restores the signal 

1 r°° 

/(*) = 7T / f» e duj - ( 62 ) 

It is an unitary transformation 

/ \f(t)\ 2 dt=±-J\f UJ \ 2 du;. (63) 
This is the so-called Parseval identity which states the conservation of energy between the time 



and the frequency domains. The formula (plf ) asks for information about the signal f(t) from both 



past and future times. It is suited for a stationary signal when the frequency u does not depend on 
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time. Thus its time-frequency band is well located in frequency and practically unlimited in time, 
i.e., Eq. ( |6TD gives a representation of the frequency content of / but not of its time-localization 
properties. Moreover, the signal f(t) should decrease fast enough at infinities the integral fl6l|) to 
be meaningful. 

The attempt to overcome these difficulties and improve time-localization while still using the 
same basis functions is done by the so-called windowed Fourier transform. The signal f(t) is 
considered within some time interval (window) only. In practice, one has to restrict the search 
for the optimal window to the easily generated windows. A simple way of doing so amounts to 
multiplying f(t) by a simplest compactly supported window function, e.g., by g w = 9(t—ti)8(tf—t), 
where 9 is the commonly used Heaviside step-function different from zero at positive values of its 
argument only, are the initial and final cut-offs of the signal (more complicated square- 

integrable window functions g, well concentrated in time, e.g., a Gaussian or canonical coherent 
states[], can be used as well). Thus one gets 



where ujq, to > are fixed, and m, n are some numbers which define the scale and location prop- 
erties. The time localization of the transform is limited now but the actual window is fixed 
by different functions in time and frequency which do not depend on the resolution scale and 
have fixed widths. Moreover, the orthonormal basis of the windowed Fourier transform can be 
constructed only for the so-called Nyquist density (corresponding to the condition u t = 2n; see 
Section 14), whereas there is no such restriction imposed on wavelets. At this critical value, frames 
are possible, but the corresponding functions are either poorly localized, or have poor regularity. 
This result is known as the Balian-Low phenomenon. In practical applications of the windowed 
Fourier transform, to achieve better localization one should choose uoto < 27r thus destroying 
orthonormality. 

The difference between the wavelet and windowed Fourier transforms lies in the shapes of 
the analyzing functions ip an d g- All g, regardless of the value of u, have the same width. In 
contrast, the wavelets ip automatically provide the time (location) resolution window adapted to 
the problem studied, i.e., to its essential frequencies (scales). Namely, let t ,5 and Uo,5 u be the 
centers and the effective widths of the wavelet basic function ip(t) and its Fourier transform. Then 
for the wavelet family ipj,k(t) (|20|) and, correspondingly, for wavelet coefficients, the center and 
the width of the window along the t-axis are given by 2 J (t + k) and 2^5. Along the cu-axis they 
are equal to 2~ J u;o and 2 -J 5 a; . Thus the ratios of widths to the center position along each axis do 
not depend on the scale. It means that the wavelet window resolves both the location and the 
frequency in fixed proportions to their central values. For the high-frequency component of the 
signal it leads to quite large frequency extension of the window whereas the time location interval is 
squeezed so that the Heisenberg uncertainty relation is not violated. That is why wavelet windows 
can be called as Heisenberg windows. Correspondingly, the low-frequency signals do not require 
small time intervals and admit wide window extension along the time axis. Thus wavelets localize 
well the low-frequency "details" on the frequency axis and the high-frequency ones on the time 

7 In quantum mechanics, they are introduced for quantizing the classical harmonic oscillator. In signal analysis 
, they are known by the name of Gabor functions. 




(64) 



In its discrete form it can be rewritten as 




(65) 
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axis. This ability of wavelets to find a perfect compromise between the time localization and the 
frequency localization by choosing automatically the widths of the windows along the time and 
frequency axes well adjusted to their centers location is crucial for their success in signal analysis 
. The wavelet transform cuts up the signal (functions, operators etc) into different frequency 
components, and then studies each component with a resolution matched to its scale providing 
a good tool for time-frquency (position-scale) localization. That is why wavelets can zoom in 
on singularities or transients (an extreme version of very short-lived high-frequency features!) in 
signals, whereas the windowed Fourier functions cannot. In terms of the traditional signal analysis 
, the filters associated with the windowed Fourier transform are constant bandwidth filters whereas 
the wavelets may be seen as constant relative bandwidth filters whose widths in both variables 
linearly depend on their positions. 

In Fig. 10 we show the difference between these two approaches. It demonstrates the constant 
shape of the windowed Fourier transform region and the varying shape (with a constant area) of the 
wavelet transform region. The density of localization centers is homogeneous for the windowed 
Fourier transform whereas it changes for the wavelet transform so that at low frequencies the 
centers are far apart in time and become much denser for high frequencies. 

From the mathematical point of view, it is important that orthonormal wavelets give good 
unconditional^ bases for other spaces than that of the square integrable functions, out-performing 
the Fourier basis functions in this respect. It is applied in the subsequent Sections to a character- 
ization of such functions using only the absolute values of wavelet coefficients. In other words, by 
looking only at the absolute values of wavelet coefficients we can determine to which space this 
function belongs. This set of spaces is much wider than in the case of the Fourier transform by 
which only Sobolev spaces^ can be completely characterized. 

As we mentioned already, the wavelet analysis concentrates near the singularities of the function 
analyzed. The corresponding wavelet coefficients are negligible in the regions where the function 
is smooth. That is why the wavelet series with a plenty of non-zero coefficients represent really 
pathological functions, whereas "normal" functions have "sparse" or "lacunary" wavelet series and 
easy to compress. On the other hand, Fourier series of the usual functions have a lot of non-zero 
coefficients, whereas "lacunary" Fourier series represent pathological functions. Let us note at the 
end that, nevertheless, the Fourier transform is systematically used in the proof of many theorems 
in the theory of wavelets. It is not at all surprising because they constitute the stationary signals 
by themselves. 

9 Wavelets and operators 

The study of many operators acting on a space of functions or distributions becomes simple when 
suitable wavelets are used because these operators can be approximately diagonalized with respect 
to this basis. Orthonormal wavelet bases provide the unique example of a basis with non-trivial 
diagonal, or almost-diagonal, operators. The operator action on the wavelet series representing 
some function does not have uncontrollable sequences, i.e., wavelet decompositions are robust. 
One can describe precisely what happens to the initial series under the operator action and how 
it is transformed. In a certain sense, wavelets are stable under the operations of integration and 

8 For unconditional bases, the order in which the basis vectors are taken does not matter. All of the known 
constructions of unconditional bases of wavelets rely on the concept of the multiresolution analysis . 

9 The function / belongs to the Sobolev space W S (R) if its Fourier transform provides the finite integrals 

/^(l + M^i/Ml 2 . 
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differentiation. That is why wavelets, used as a basis set, allow us to solve differential equations 
characterized by widely different length scales found in many areas of physics and chemistry. 
Moreover, wavelets reappear as eigenfunctions of certain operators. 

To deal with operators in the wavelet basis it is convenient, as usual, to use their matrix 
representation. For a given operator T it is represented by the set of its matrix elements in the 
wavelet basis: 

Tj,k;j,'k' =< Tpj,k\ T \Tpj',k> > • (66) 

For linear homogeneous operators, their matrix representation can be explicitly calculated [TJJ . 

It is extremely important that it is sufficient to first calculate the matrix elements at some (j-th) 
resolution level. All other matrix elements can be obtained from it using the standard recurrence 
relations. Let us derive the explicit matrix elements rjj.jji of the homogeneous operator T of the 
order a: 

r j,l;j,i' =< <Pj,l\ T \<Pj,l' > ■ (67) 
Using the recurrence relations between the scaling functions at the given and finer resolution 
levels, one gets the following equation relating the matrix elements at neighboring levels: 

r j,i;j,i' =< (X/ hk(Pj+i,2i-k)\T\(^2 hk ,( Pj+i,2l'-k') >= ^2^2h k h k irj + i t 2i-k;j+i,2i'-k'- (68) 

k k> k k' 

For the operator T having the homogeneity index a one obtains 

r jt w = 2° hhk>rj,2i-k;j,2i'-k'- (69) 

k k' 

The solution of this equation defines the required coefficients up to the normalization constant 
which can be easily obtained from the results of the action by the operator T on a polynomial of 
a definite rank. For non-integer values of a, this is an infinite set of equations. 
The explicit equation for the n-th order differentiation operator is 

d n 

= (<P(x)\—\(p(x - k)) = 
^hih m (ip(2x + — \ip{2x + m- k)) = 

2"£/a m r> ) __. (70) 

i,m 

It leads to a finite system of linear equations for r k (the index n is omitted): 

2r fc = r 2 k + ^2a 2 m + i(r 2 k-2m+i + (-l) n r 2 fc+2m-i), (71) 

m 

where both r k and a m = J2i hih i+m (a = 1) are rational numbers in the case of Daubechies 
wavelets. The wavelet coefficients can be found from these equations up to a normalization 
constant. The normalization condition reads |TT| : 

Y.k n r k = n\. (72) 

k 

For the support region of the length L, the coefficients r k differ from zero for — L + 2 < k < L — 2, 
and the solution exists for L > n + 1. These coefficients possess the following symmetry properties: 

rk = r- k (73) 
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for even k, and 

rk = -r-k (74) 

for odd values of k. 

Here, as an example, we show in the Table the matrix elements of the first and second order 
differential operators in the Daubechies wavelet basis with four vanishing moments (-D 8 ). 



Tab 


e. 


K 


hu 


(u>(x)W\io(x — k)) 


(<x)(x)N 2 \lp(x - k)) 


-0 





0.00000084 


0.00001592 


K 
o 





-0.00017220 


-0.00163037 


-4 





-0.00222404 


-0.01057272 


-3 





0.03358020 


0.15097289 


-2 


-0.07576571 


-0.19199897 


-0.69786910 


-1 


-0.02963552 


0.79300950 


2.64207020 





0.49761866 





-4.16597364 


1 


0.80373875 


-0.79300950 


2.64207020 


2 


0.29785779 


0.19199897 


-0.69786910 


3 


-0.09921954 


-0.03358020 


0.15097289 


4 


-0.01260396 


0.00222404 


-0.01057272 


5 


0.03222310 


0.00017220 


-0.00163037 


6 





-0.00000084 


0.00001592 



For a continuous linear operator T represented by a singular integral 

Tf[x) = J K(x,y)f(y)dy (75) 

with some definite conditions imposed on the kernel K (see Appendix 2), there exists the important 
theorem (called T(l) theorem) which states a necessary and sufficient condition for the extension 
of T as a continuous linear operator on L 2 (R n ) (more details and the elegant wavelet proof of the 
theorem are given in Ref. [[|]). 

Let us note that a standard difficulty for any spectral method is a representation of the operator 
of multiplication by a function. As an example for physicists, we would mention the operator of 
the potential energy in the Schrodinger equation. However, it is well known that this operation 
is trivial and diagonal in the real space. Therefore for such operations one should deal with a 
real space, and after all operations done there return back. Such algorithm would need O(N) 
operations only. 

10 Nonstandard matrix multiplication 

There are two possible ways to apply operators onto functions within wavelet theory. They are 
called the standard and nonstandard matrix forms. 

For smooth enough functions most wavelet coefficients are rather small. For a wide class of 
operators, most of their matrix elements are also quite small. Let us consider the structure of 
the elements of the matrix representation of some operator T that are large enough. The matrix 
elements satisfy the following relations 

Tj,k;j',k' 0, where \k — k'\ — * oo, (76) 
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Tj,k;j>,k> -> 0, where \j - j'\ -> oo. (77) 

The topology of the distribution of these matrix elements within the matrix can be rather compli- 
cated. The goal of the nonstandard form is to replace the latter equation ( \T7\) by another, more 
rigorous one: 

'/'./,:/./• 0. if J^f. (78) 

It avoids taking the matrix elements between the different resolution levels. To deal with it, one 
should consider, instead of the wavelet space, the overfull space with the basis containing both 
wavelets and scaling functions at various resolution levels. 

Let us consider the action of the operator T on the function / which transforms it into the 
function g: 

9 = Tf. (79) 

Both g and / have the wavelet representations with the wavelet coefficients (fSj tk ', fdj, k ) and 
( g Sj 7 k] gdj,k)- At the finest resolution level j n only s-coefficients differ from zero, and the transfor- 
mation looks like 

g s in,k = J2 T ssUm k;j n , k') f Sj n>k i. (80) 
k' 

At the next level one gets in both the standard and nonstandard approaches 

g s jn~i,k = J2 T ssUn - 1, k;j n - 1, fc')/ s i„-i,fc' + J2 T SD(jn ~ 1, k;j n - 1, k') fdj n - 1;k > , (81) 
k' k' 

g d jn ^ hk = J2 T Ds(jn ~ 1, k; j n - 1, k')fs jn ^ lik > + ^ T ddUu - 1, k;j n - 1, k')fdj n -i ik ', (82) 

k' k' 

where T ss (j n , k] j n , k') = f dxifj ntk (x)Tifj ntk /(x) and the replacement of subscripts S — > D corre- 
sponds to the substitute <p — > ip in the integrals. There is coupling between all resolution levels 
because all s-coefficients at this (j n — l)-th level should be decomposed by the fast wavelet trans- 
form into s and ^-coefficients at higher levels. Therefore even for the almost diagonal initial step 
the standard matrix acquires rather complicated form as demonstrated in Fig. 11 for 4- level op- 
erations (similar to those discussed above in the case of the Haar wavelets). It is thus inefficient 
for numerical purposes. 

As we see in Fig. 11, at the final stage of the standard approach we have to deal with the wavelet 
representation corresponding to the formula fl22|) with only one s-coefficient left in the vectors 
which represents the overall weighted average of the functions (the S'S'-transition from / to g is 
given by the upper left box in the Figure). At the same time, in the process of approaching it from 
the scaling-function representation (|23|), ([[]) we had to deal with average values at intermediate 
levels decomposing them at each step into s and d parts of further levels. These intermediate 
s-coefficients have been omitted since they were replaced by s and rf-coefficients at the next levels. 
That is why the matrix of the standard approach looks so complicated. 

To simplify the form of the matrix, it was proposed |2(J to use the redundant set of the wavelet 
coefficients. Now, let us keep these average values in the form of corresponding s-coefficients of 
the intermediate levels both in initial and final vectors representing functions / and g. Surely, we 
will deal with redundant vectors which are larger than necessary for the final answer. However, 
it should not bother us because we know the algorithm to reduce the redundant vector to a 
non-redundant form. At the same time, this trick simplifies both the form of the transformation 
matrix and the computation. This non-standard form is shown in Fig. 12. Different levels have 
been completely decoupled because there are no blocks in this matrix which would couple them. 



24 



The block with SS-elements has been separated, and in place of it the zero matrix is inserted. 
The whole matrix becomes artificially enlarged. Correspondingly, the vectors, characterizing the 
functions / and g, are also enlarged. Here all intermediate s-coefficients are kept for the function 
/ (compare the vectors in the right-hand sides of Figs. 11 and 12). Each Sj + \ is generated from 
Sj and Dj. That is where the coupling of different levels is still present. In the transformation 
matrix all S'S'-elements are zero except for the lowest one SqSq. All other SD, DS, DD matrices 
are almost diagonal due to the finite support of scaling functions and wavelets. The redundant 
form of the ^-function vector of Fig. 12 may be reduced to its usual wavelet representation of Fig. 
11 by splitting up any Sj into Sj-i and -Dj-i by a standard method. Then these Sj-i and Dj-\ are 
added to the corresponding places in the vector. This iterative procedure allows, by going from 
Sj-i down to So, to get the usual wavelet series of the function g. We get rid of all s-coefficients 
apart from S . The computation becomes fast. 



11 Regularity and differentiability 

The analysis of any signal includes finding the regions of its regular and singular behavior. One of 
the main features of wavelet analysis is its capacity of doing a very precise analysis of the regularity 
properties of functions. When representing a signal by a wavelet transform, one would like to know 
if and at which conditions the corresponding series is convergent and, therefore, where the signal 
is described by the differentiable function or where the singularities appear. For certain singular 
functions, essential information is carried by a limited number of their wavelet coefficients. Such 
an information may be used in the design of numerical algorithms. We start with the traditional 
Holder conditions of regularity and, in the next Section, proceed to their generalization following 
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The Holder definition of the pointwise regularity at a point xq of a real-valued function / 
defined on R n declares that this function belongs to the space C a (x ) (a > —n) if there exists a 
polynomial P of order at most a such that 

\f{x)-P(x-x )\<C\x-x \ a , (83) 

where C is a constant. 

The supremum of all values of a such that ( |83"D holds is called the Holder exponent of / at xq 
and denoted a(x o )0- 

Correspondingly, a point x is called a strong a-singularity (— n < a < 1) of / if, at small 
intervals, the following inequality is valid 

\f(x)-f(y)\>C\x-y\ a (84) 

for a relatively large set of x's and y's close to x . 

Such definitions work quite well for a rather smooth function. However, in the case of a function 
drastically fluctuating from point to point they are difficult to handle at each point because, e.g., 
the derivative of / may have no regularity at Xq, the condition ([83]) may become unstable under 
some operators such as Hilbert transform etc. The generalized definition of pointwise regularity 



10 In principle, the expression \x — xo\ a in the right-hand side of Eq. ( p3| ) can be replaced by a more general 
function satisfying definite conditions and called a modulus of continuity but we shall use only the above definition. 
In general, a modulus of continuity defines the largest deviation from that best polynomial approximation of a 
function / which is characterized by the set of smallest deviations compared to other polynomial approximations. 
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will be given in the next Section by introducing two-microlocal spaces. Here, we concentrate on 
global properties of the function /. 

The uniform regularity of a function / at positive non-integer values of a consists in the 
requirement for the Eq. flS5| ) to hold for all real n-dimensional Xq with the uniform value of the 
constant C . At first sight, this condition looks rather trivial because for smooth functions the 
uniform regularity coincides with the pointwise regularity, which is everywhere the same. To 
understand that this is non-trivial, one could consider, e.g., the function f(x) = xsinl/x with 
pointwise exponents a(0) = 1, ol(xq) = oo for any fixed Xq different from 0, and uniform Holder 
exponent a = 1/2 which appears for the set of points x$ oc \x — xo] 1 ^ 2 3> \x — Xq\. 

The requirement of a definite uniform regularity of a n-dimensional function / on the whole 
real axis at positive non-integer values of a can be stated in terms of wavelet coefficients as an 
inequality 

\d jjk \ < C2-(f+ a «. (85) 

Thus from the scale behavior of wavelet coefficients one gets some characteristics of uniform 
regularity of the function. In particular, the linear dependence of the logarithms of wavelet 
coefficients on the scale j would indicate on the scaling properties of a signal, i.e., on the fractal 
behavior whose parameters are determined from higher moments of wavelet coefficients (see below). 
For pointwise regularity determining the local characteristics, the similar condition is discussed in 
the next Section. 

Now, let us formulate the conditions under which wavelet series converge at some points, or 
are differentiable. It has been proven that 

• if / is square integrable, the wavelet series of / converges almost everywhere; 

• the wavelet series of / is convergent on a set of points equipotent to R if 

\d jik \ = 2~ j %, (86) 
where rjj tends to when j tends to +oo; 

• the function / is almost everywhere differentiable and the derivative of the wavelet series of 
/ converges almost everywhere to /', if the following condition is satisfied 

£Kfc| 2 2 2j '<oo; (87) 

• the function / is differentiable on a set of points equipotent to R and, at these points the 
derivative of its wavelet series converges to /', if 

\dj, k \ < 2-^\, (88) 

where rjj tends to when j tends to +oo. 

Limitations imposed on wavelet coefficients are generalized in the next Section to include the 
pointwise regularity condition when two-microlocal spaces are considered. These better estimates 
do not hold just uniformly, but hold locally (possibly, apart from a set of points of small Hausdorff 
dimension which can be determined by wavelet analysis ). 

Note finally that one can also derive global regularity of / from the decay in uo of the absolute 
value of its windowed Fourier transform, if the window function g is chosen to be sufficiently 
smooth. In most cases, however, the value of the uniform Holder exponent computed from the 
Fourier coefficients will not be optimal. Nothing can be said from Fourier analysis about the local 
regularity, in contrast to two-microlocal wavelet analysis discussed below. 
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12 Two-microlocal analysis 



The goal of the two-microlocal analysis is to reveal the pointwise behavior of any function from 
the properties of its wavelet coefficients. The local regularity of the function is thus established. 

The scalar product of an analyzed one- dimensional function and a wavelet maps this function 
into two-dimensional wavelet coefficients which reveal both scale and location properties of a 
signal. In the above definitions of the pointwise regularity of the function at xq determined by the 
Holder exponent only local but not scaling properties of the signal are taken into account. The 
problem of determining the exact degree of the Holder regularity of a function is easily solved if 
this regularity is everywhere the same, because in such a case it usually coincides with the uniform 
regularity. The determination of the pointwise Holder regularity, however, becomes much harder 
if the function changes wildly from point to point, i.e., its scale (frequency) characteristics depend 
strongly on the location (time). In that case we have to deal with a very non-stationary signal. 
To describe the singularity of f(x) at xq only by the Holder exponent a(xo) < 1 one looks for 
the order of magnitude of the difference \f(x) — f(xo)\ when x tends to xo, without taking into 
account, e.g., the possible high-frequency oscillations of f(x) — f(x ), i.e., its scale behavior. 

The properties of the wavelet coefficients as functions of both scale and location^ provide a 
unique way of describing the pointwise regularities. It is more general than the traditional Holder 
approach because it allows us to investigate, characterize and easily distinguish some specific local 
behaviors such as approximate selfsimilarities and very strong oscillatory features like those of 
the indefinitely oscillating functions which are closely related to the so-called "chirps" of the form 
x a sin(l/a;^) (reminding of bird, bat or dolphin sonar signals with very sharp oscillations which 
accelerate at some point xq)- Chirps are well known to everybody dealing with modern radar and 
sonar technology. In physics, they are known to the authors, e.g., in theoretical considerations 
of the dark matter radiation and of the gravitational waves. Sometimes, similar dependence can 
reveal itself even in the more traditional correlation analysis f22 , 23 , 24 1. The large value of a 



second derivative of the phase function of a frequency modulated signal is their typical feature. 



Such special behavior with frequency modulation laws hidden in a given signal can be revealed [21 
with the help of the two-microlocal analysis . Actually, there is no universally accepted definition 
of a chirp. Sometimes, any sine of a non-linear polynomial function of time is also called a chirp. 

The two-microlocal space C s,s (xo) of the real- valued n-dimensional functions / (distributions) 
is defined by the following simple decay condition on their wavelet coefficients dj^ 

\d jtk (x )\ < C2-(f + ^(l + \Vx - k\y s ', (89) 

where s and s' are two real numbers. This is a very important extension of the Holder conditions. 
The two-microlocal condition is a local counterpart of the uniform condition (jS5| ). It is very closely 
related to the pointwise regularity condition (jS3|) because it expresses, e.g., the singular behavior 
of the function itself at the point Xq in terms of the fc-dependence of its wavelet coefficients at the 
same point. Such an estimate is stable under derivation and fractional integration. 
For s' = 0, the space thus defined is the global Holder space C s (R n ). 

If s' > 0, these conditions are weaker at xq than far from xq so that they describe the behavior 
of a function which is irregular at xq in a smooth enviroment, i.e., the regularity of / gets worse 
as x tends to xq. If s' is positive, the inequality ( jS9"D implies that \dj t k\ < C2~^+ s W j so that / 
belongs to C s (R n ). The Holder norms do not give any information about this singularity because 

n Let us note that the word "location" (and, correspondingly, "scale") does not necessarily imply a single di- 
mension but could describe the position (and size or frequency) in any n-dimensional space. 
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one has to consider them on domains that exclude xq, and the Holder regularity of / deteriorates 
when x gets close to xq. That is why we need a uniform regularity assumption when s' is positive. 

If s' < 0, the converse situation takes place when xq is a point of regularity in a nonsmooth 
enviroment. 

Thus the inequality Eq. (|89| ) clarifies the relationship between the pointwise behavior of a 
function and the size properties of its wavelet coefficients. If s > and s' > s, then C s,s (x ) 
is included in C s (x ). One can prove that the elements of the space C s ' s (x ), for s' > —s, are 
functions whereas the elements of C s '~ s (xo) are (in general) not functions but distributions (and, 
moreover, quite "wild distributions" for which the local Holder condition (|83|) cannot hold and 
should be generalized by multiplying the right-hand side by the factor logl/|x — xo\). However, 
if / belongs to the space C s (xq) then it belongs to C s ~ s (xo) too. In particular, if at x close to xq 
the following inequality |/(x)| < C\x — x$\ s is valid for s < 0, imposing some limit on the nature 
of the singularity at this point, then the estimates for wavelet coefficients are given by 

\djjk\ < C2-^ +s ^\k - yx \ s , (90) 

if the support of ipj,k is at least at a distance 2~i from x , and 

K*| < C2~^ + ^, (91) 

if the support of ipj^ is at a distance less than 2~ J from xq, which demonstrate the above statement. 

The two-microlocal spaces have some stability properties. In particular, the pseudodifferential 
operators of order are continuous on these spaces (in contrast with the usual pointwise Holder 
regularity condition which is not preserved under the action of these operators, for example, of 
the Hilbert transform). The position of the points of regularity of a function / which belongs to 
the space C s,s (xq) is essentially preserved under the action of singular integral operators such as 
the Hilbert transform. This property leads to a pointwise regularity result for solutions of partial 
differential equations which is formulated [^TJ by the following 



Theorem. 

Let A be a partial differential operator of order m, with smooth coefficients and elliptic at Xq. If 
Af = g and g belongs to C s ' s '(xo), then / belongs to C s+m ' s ' (x ). 

Thus, if Af = g, and if g is a function that belongs to C 3 (xo), then there exists a polynomial 
P of degree less than s + m such that, for \x — x \ < 1, 

\f(x)-P(x)\ < C\x-x \ s+m . 

All the above conditions look as if the wavelets were eigenvectors of the differential operators 
gP, |a| < r, with corresponding eigenvalues 2^ a K 

The more general functional spaces which admit the power dependence of wavelet coefficients 
on the scale j defined by an extra parameter p are considered in Ref. |2"I |. In particular, those 



are the Sobolev spaces where the integrability is required up to some power of both the function 
itself and its derivatives up to the definite order. In this case, the additional factor j 2 l p appears 
in the right-hand side of the inequality (|89~D, or, in other words, in addition to the linear term 
in the exponent one should consider another logarithmically dependent term. Thus there is no 
rigorous selfsimilarity (fractality) of the function any more. However, the wavelet analysis allows 
us to determine the fractal dimensions of the sets of points where the function / is singular. 
For continuous wavelets, the condition equivalent to Eq. (|5P|) takes the form 



\W{a,b)\ < Ca s (l + J ^)~ s '. (92) 
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The definite upper limits on the behavior of | W(a, 6) | can also be obtained if / is integrable in 
the neighborhood of the origin (see pip . They are somewhat complicated, and we do not show 
them here. 

It is instructive to examine the meaning of the conditions of the type (|92|). Let us consider 
the cone in the (b, a) half-plane defined by the condition \b — Xq\ < a. Within this cone we have 
|W(a, 6)| = 0(a s ) as a — > 0. Outside the cone, the behavior is governed by the distance of 6 to 
the point Xq. These two behaviors are generally different and have to be studied independently. 
However, it is shown in that non-oscillating singularities may be characterized by the behavior 



of the absolute values of their wavelet coefficients within the cone. It is also shown in |25j that 
rapidly oscillating singularities, which we consider below, cannot be characterized by the behavior 
of their wavelet transform in the cone. 

The two-microlocal methods proved especially fruitful in the analysis of oscillating trigono- 
metric and logarithmic chirps. 

A simplest definition of trigonometric chirps at the origin x = could simply be read 

f(x)=x a g(x- p ), (93) 

where g is a 27r-periodic C r function with vanishing integral. This type of behavior leads to the 
following expansion of continuous wavelet coefficients at small positive b < 6 on curves a = Xb 1+/3 

W{\b 1+f3 ,b) = b a m x {b~ 13 ), (94) 

where m\ is a 27r-periodic function with a definite norm. It shows that rapidly oscillating singu- 
larities cannot be characterized by the behavior of their wavelet transform in the cone as in the 
previous examples. In this case, the wavelet coefficients should be carefully analyzed not in the 
cone but on definite curves because they are concentrated along these ridges. For small enough 
scales, such a ridge lies outside the influence cone of the singularity. A typical example is given 
by the function f(x) = sinl/x whose instantaneous frequency tends to infinity as x —>■ 0. The 
wavelet coefficient modulus is maximum on a curve of equation b = Ca 2 for some constant C 
depending only on the wavelet and this curve is not a cone. Therefore it is not sufficient to 
study the behavior of wavelet coefficients inside the cone to characterize the rapidly oscillating 
singularities. Let us note, that in case of noisy signals, the contribution of the deterministic part 
of the signal may be expected to be much larger than that of a noise just near the ridges of the 
wavelet transform because the wavelet transform of the noise is spread in the whole (j, fc)-plane. 
It can be thus used for denoising the signal. 

Logarithmic chirps have an approximate scaling invariance near the point xq- A function / 
has a logarithmic chirp of order (a, A) and of regularity 7 > at the origin x = if for x > 
there exists a log A-periodic function G(logx) in C 7 (i?) such that 

f(x) - P{x) = x a G(\og x). (95) 

The continuous wavelet coefficients of it satisfy the condition 

W(a,b) = a a H(\oga,b/a), (96) 

where H is log A-periodic in the first argument and its behavior in b/a is restricted by some decay 
conditions. In other words, if the following scaling property is observed 

C(Xa,Xb) = X a C{a,b) (97) 
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for A < 1 and small enough a and b, then / has a logarithmic chirp of order (a, A). More general 
and rigorous statements and proofs can be found in Ref. f2~If . Thus we conclude that the type 



and behavior of a chirp can be determined from the behavior of its wavelet coefficients. 

As an example of the application of the above statements, let us mention the continuous 
periodic function a(x) represented by the Riemann series 



OO 

a(x) = — sin nn 2 x, (9£ 
n z 



i 

which belongs to the space C 1 ^ 2 . It is best analyzed by a continuous wavelet transform using the 
specific complex wavelet if>(x) = (x + i)~ 2 proposed by Lusin in 1930s in the functional analysis. 
Then the wavelet transform of the Riemann series is given by the well known Jacobi Theta function. 
The behavior of wavelet coefficients determines the chirp type and its parameters. With the help 
of the two-microlocal analysis it was proven that this function has trigonometric chirps at some 
rational values of x = xq which are ratios of two odd numbers (also, its first derivative exists 
only at these points) and logarithmic chirps at quadratic irrational numbers (near these points its 
wavelet transform possesses the scaling invariance properties). 

For practical purposes, however, it may happen that very large values of j are needed to 
determine Holder exponents reliably from the above conditions. For the global Holder exponent, 
no assumptions about regularity of wavelets is required whereas determination of the local Holder 
exponents requires a more detailed approach. 

A nice illustration of application of wavelet analysis for studies of local regularity properties 
of the function f(x) is given in [Q] for the following function 

f(x) = 2e-^6(-x - 1) + e- lxl 6{x + 1)6{1 - x) + e' x [(x - l) 2 + l\0(x - 1), (99) 

which is infinitely differentiable everywhere except at x — —1,0,1 where, respectively, f,f',f" 
are discontinuous. One computes for each of the three points the maxima of wavelet coefficients 
Aj = maXk\Wj^\ at various resolution levels 3 < j < 10 and plots log A, / log 2 versus j. The 
linear dependences on j are found at these points. The slopes at x — — 1, 0, 1 are, correspondingly, 
-0.505 ; -1.495 ; -2.462, leading with pretty good accuracy of 1.5% to rather precise estimates of 
the Holder exponents 0, 1, 2. 

Even better accuracy in determining the local regularity of a function can be achieved with the 
help of redundant wavelet families (frames) where the translational non-invariance is much less 
pronounced, the wavelet regularity plays no role and only the number of its vanishing moments is 
important. 

If orthonormal wavelets are used, then their regularity can become essential. Typically, they 
are continuous, have a non-integer uniform Holder exponent and different local exponents at 
different points. In fact, there exists a whole hierarchy of (fractal) sets in which these wavelets 
have different Holder exponents. There is a direct relation between the regularity of the function 
ip and the number of its vanishing moments. The more moments are vanishing, the more smooth 
is the function ijj. For example, for widely used Daubechies wavelets with a compact support the 
degree of regularity increases linearly with the maximum number of vanishing moments N for 
higher-tap wavelets as fiN with /i = 0.2 and, correspondingly, with the support width, as was 
already mentioned in Section 4. These properties justify the name "mathematical microscope", 
which is sometimes bestowed on the wavelet transform. Let us note that the choice of the hk 
that leads to maximal regularity is, however, different from the choice with maximal number of 
vanishing moments for ip. How well the projections of a multiresolution approximation Vj converge 
to a given function / depends on the regularity of the functions in Vq. 
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There exists a special class of wavelets called vaguelets. They possess the regularity properties 
characterized by the following restrictions: 

\i>j,k( x )\ < C2 n ^ 2 {l + \2 j x - k\)- n ~ a , (100) 
fe(z') - ^jjk{x)\ < C2^ n ' 2+ ^\x' - af (101) 



with < (3 < a < 1 and a constant C . Surely, the cancellation condition (|2T ) must be also 



satisfied. The norm of any function / is then limited by its wavelet coefficients as 

imi<c'(EK>l 2 ) 1/2 (102) 

with a constant C . Any continuous linear operator T on L 2 which satisfies the condition T(l) = 
transforms an orthonormal wavelet basis into vaguelets. 

In practice, the regularity of a wavelet can become especially important during the synthesis 
when after omission of small wavelet coefficients it is better to deal with a rather smooth if> to 
diminish possible mistakes at the restoration stage. On the contrary, for analysis it seems to 
be more important to have wavelets with many vanishing moments to "throw out" the smooth 
polynomial trends and reveal potential singularities. Also, the large number of vanishing moments 
leads to better compression but can enlarge mistakes at inverse procedure of reconstruction because 
of worsened regularity of wavelets. The use of the biorthogonal wavelets helps a lot at this stage 
because among two dual wavelets one has many vanishing moments whereas another possesses 
good regularity properties. By choosing the appropriate inversion formula one can minimize 
possible mistakes. 



13 Wavelets and fractals 



Some signals (objects) possess the self-similar (fractal) properties (see, e.g., [^, |27|, ^8], ^J|). 
It means that by changing the scale one observes at a new scale the features similar to those 
previously noticed at other scales. This property leads to power-like dependences. The formal 
definition of a (mono)fractal Hausdorff dimension Dp of a geometrical object is given by a condition 

< lim e _^ N{e)e DF < oo, (103) 

which states that Dp is the only value for which the product of the minimal number of the (covering 
this object) hypercubes N(e) with a linear size I = e and of the factor e Dp stays constant for e 
tending to zero. In the common school geometry of homogeneous objects, it coincides with the 
topological dimension. The probability Pi(e) to belong to a hypercube iVj(e) is proportional to 
e Dp , and the sum of moments is given by 

]Tp?(e) ocA (104) 

i 

The fractal dimension is directly related to the Holder exponents. 

Moreover, for more general objects called multifractals, the "fractal exponents" a(xo) (see 
Eq. (pq)) vary from point to point. The Hausdorff dimension of the set of points Xo, where 
a(xo) = «o ; is a function of d(ao) whose graph determines multifractal properties of a signal, let 
it be Brownian motion, a fully developed turbulence or a purely mathematical construction of 
the Riemann series. Thus the weights of various fractal dimensions inside a multifractal differ for 
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different multifractals, and the value Dp is now replaced by D q+ \ which depends on q. It is called 
the Renyi (or generalized) dimension | 30|| . Usually, it is a decreasing function of q. The fractal 



(Hausdorff), information and correlation dimensions are, correspondingly, obtained from the Renyi 
dimension at q = — 1; 0; 1. The difference between the topological and Renyi dimensions is called 
the anomalous dimension (or codimension). Fractals and multifractals are common among the 
purely mathematical constructions (the Cantor set, the Serpinsky carpet etc) and in the nature 
(clouds, lungs etc). 

The pointwise Holder exponents are now determined using wavelet analysis . As we have seen, 
all wavelets of a given family ipj^x) are similar to the basic wavelet i[)(x) and derived from it by 
dilations and translations. Since wavelet analysis just consists in studying the signal at various 
scales by calculating the scalar product of the analyzing wavelet and the signal explored, it is 
well suited to reveal the fractal peculiarities. In terms of wavelet coefficients it implies that their 
higher moments behave in a power-like manner with the scale changing. The wavelet coefficients 
are less sensitive to noise because they measure, at different scales, the average fluctuations of the 
signal. 

Namely, let us consider the sum Z q of the g-th moments of the coefficients of the wavelet 
transform at various scales j 

Z q (j) = j:\dj,k\ q , (105) 



where the sum is over the maxima of \dj^\- Then it was shown [[Jl], |32| that for a fractal signal 
this sum should behave as 

Z q {j) oc 2 j[T{q)+ i\ (106) 

i.e., 

\ogZ q (j)ocj[r(q)+^}. (107) 

Thus the necessary condition for a signal to possess fractal properties is the linear dependence of 
log Z q (j) on the level number j. If this requirement is fulfilled the dependence of r on q shows 
whether the signal is monofractal or multifractal. Monofractal signals are characterized by a 
single dimension and, therefore, by a linear dependence of r on q, whereas multifractal ones are 
described by a set of such dimensions, i.e., by non-linear functions r(q). Monofractal signals are 
homogeneous, in the sense that they have the same scaling properties throughout the entire signal. 
Multifractal signals, on the other hand, can be decomposed into many subsets characterized by 
different local dimensions, quantified by a weight function. The wavelet transform removes lowest 
polynomial trends that could cause the traditional box-counting techniques to fail in quantifying 
the local scaling of the signal. The function r(q) can be considered as a scale-independent measure 
of the fractal signal. It can be further related to the Renyi dimensions, Hurst and Holder (at 
q = 1 as is clear from the examples in the previous Sections) exponents (for more detail, see 



Refs. p3\> |34|1). The range of validity of the multifractal formalism for functions can be elucidated 
35| with the help of the two-microlocal methods generalized to the higher moments of wavelet 
coefficients. Thus, wavelet analysis goes very far beyond the limits of the traditional analysis which 
uses the language of correlation functions (see, e.g., |3"3]| ) in approaching much deeper correlation 
levels. 

Let us note that Z 2 (j) is just the dispersion (variance) of wavelet coefficients whose average 
is equal to zero. For positive values of q, Z q {j) reflects the scaling of the large fluctuations and 
strong singularities, whereas for negative q it reflects the scaling of the small fluctuations and 
weak singularities, thus revealing different aspects of underlying dynamics. 
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14 Discretization and stability 



In signal analysis , real-life applications produce only sequences of numbers due to the discretiza- 
tion of continuous time signals. This procedure is called the sampling of analog signals. Below we 
consider its implications. At first sight, it seems that in this case the notions of singularities and 
Holder exponents are meaningless. Nevertheless, one can say that the behavior of wavelet coeffi- 
cients across scales provides a good way of describing the regularity of functions whose samples 
coincide with the observations at a given resolution. 

First, we treat the doubly infinite sequence /w = {/„} = {f(nAt)} for — oo < n < oo 
obtained by sampling a continuous (analog) signal at the regularly spaced values t n = nAt. If 
within each n-th interval At the function / can be replaced by the constant value f(nAt) then 
for small enough At one gets 

/oo 00 
f(t)e~ iut dt « At E f(nAt)e~ inujAt . (108) 
-°° n=-oo 

The inversion formula reads 

U = TT fJ™ M dw. (109) 

Z7T J-00 

The function f u is periodic with the period 2^/At. It means that one can consider it only 
within the frequency interval [— it/ At, ir/ At). Moreover, for real signals even the interval [0, ir/ At) 
is sufficient. For time-limited signals the summation in Eq. ( [L08|) is done from n min = to 
n max = N — 1 for N sampled values of a signal. For fast Fourier transform algorithms, it asks for 
O(NlogN) computations. 

To get the relation between Fourier transforms of the continuous time signal / and of the 
sequence of samples we rewrite the formula ( |109| ) 

1 ~ f (2k+l)n/At . 

fn = - E / Ue™ At du = 

27T J(2k-l)lv/At 



1 /-vr/At 00 1 rir/At 

7T- / E W/ W ^ = 7T / fL d) e inwAt du, (110) 

Z7T J-Tv/At u^Z ZVT J -it/ At 

wherefrom one gets 

00 

/i.^ = E f^+2kn/At- (HI) 
k=— 00 

Thus the Fourier transform of the discrete sample contains, in general, the contributions from the 
Fourier transform of the continuous signal not only at the same frequency uj but also at countably 
infinite set of frequencies uj + 2kir / At. These frequencies are called aliases of the frequency uj. 
For " undersampled" sets, the aliasing phenomenon appears, i.e., the admixture of high frequency 
components to lower frequencies. The frequency u)(n) = n / At is called the Nyquist frequency. 
To improve the convergence of the series, the " oversampling" is used, i.e., / is sampled at a rate 
exceeding the Nyquist rate. 

The band-limited function / can be recovered from its samples / n 's whenever the sampling 
frequency uj s oc (At) -1 is not smaller than the band-limit frequency ujf, i.e., whenever uj s > ujf. 
This statement is known under the name of the sampling (or Shannon-Kotelnikov) theorem. 
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In this case, there is no aliasing since only one aliased frequency is in the band limits [—u)f, Wf\. 
If u s < Uf, then the function / cannot be recovered without additional assumptions. 

The rigorous proof of the theorem exists, but it is quite clear already from intuitive arguments 
that one cannot restore the high-frequency content of a signal by sampling it with lower frequency. 
Therefore, to get knowledge of high-frequency components one should sample a signal with fre- 
quency exceeding all frequencies important for a given physical process. Only then the restoration 
of a signal is stable. One can reduce the sampling frequency uj s all the way down to Uf without 
losing any information. It allows to subsample the signal by keeping only smaller sets of data, i.e., 
to get a shorter sampled signal. 

In case of wavelet analysis , to have a numerically stable reconstruction algorithm for / from 
dj t k one should be sure that ipj^ constitute a frame. For better convergence, one needs frames 
close to tight frames, i.e., those satisfying the condition (| - 1) < 1. The orthonormal wavelet 
bases have good time-frequency localization. In principle, if i[) itself is well localized both in time 
and in frequency, then the frame generated by if) will share that property as well. 

Discrete wavelets are quite well suited for the description of functions by their mean values at 
the equally spaced points. However in practical real-life applications, apart from this projection 
of a function, one has to deal with a finite interval of its definition and with a finite number 
of resolution levels. As was mentioned in Section 5, one usually rescales the "units" of levels 
by assuming that the label of the finest available scale is j = 0, and the coarser scales have 
positive labels 1 < j ' < J. The fine-level coefficients are defined by the sampled values of f(x) as 
so,k = f(k), i.e., instead of the function f(x) one considers its projection P f. The higher-level 
coefficients are found from the iterative relations (flop , (ffTD, i.e., with the help of fast wavelet 
transform without direct calculation of integrals / f(x)ipj^(x)dx and / f(x)ipj ! k(x)dx. Therefore 
the approximate representation of the function f{x) which corresponds to the redefined versions 



of Eqs. fl5|) and ( j22"|) can be written as 



f{x) « P f = d i,k^j,k + s J,k¥J,k, (112) 

j=l k k 

where the sum over k is limited by the interval in which f(x) is defined. Moreover, to save the 
computing time, one can use not a complete set of wavelet coefficients d^ k but only a part of 
them omitting small coefficients not exceeding some threshold value e. This standard estimation 
method is called estimation by coefficient thresholding. If the sum in ( |112| ) is taken only over such 
coefficients \djj\ > e and the number of the omitted coefficients is equal to n , then the function 
f e which approximates f(x) in this situation will differ from f(x) by the norm as 

||/(*)-/ e (*)||<er^ 2 . (113) 

Thus for a function, which is quite smooth in the main domain of its definition and changes 
drastically only in very small regions, the threshold value e can be extremely small. Then it admits 
large number of wavelet coefficients to be omitted with rather low errors in final approximations. 
Instead of this so-called hard- shrinkage procedure, one can use a soft-shrinkage thresholding |36[ 



which shifts positive and negative coefficients to their common origin after the omission procedure, 
i.e., replaces non-omitted dj t k in the formula ( |1 1 2| ) by 

dfl = sign{dj,k){\dj,k\ - e). (114) 
It has been proven that such an approach leads to optimal min-max estimators. 
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One can find the coefficients of the expansion (|112 ) with use of the fast wavelet transform 
since the coefficients so,/c are fixed as the discrete values of f(x). In iterative schemes, the error 
will however accumulate and their precision will not be sufficiently high. Much better accuracy 
can be achieved if the interpolation wavelets are used. In this case, the values of the function 
on the homogeneous grid f(k) are treated as s-coefficients for the interpolation basis, and initial 
values of So,fc are formed by some linear combinations of them with coefficients which depend on 
the shapes of wavelets considered. 

The more elaborate procedure of weighting different wavelet coefficients before their omission 
called quantization is often used instead of this simplified procedure of the coefficient thresholding 
(see also Section 15.4). According to experts estimates, different weights (importance) are ascribed 
to different coefficients from the very beginning. These weights depend on the final goals of 
compression. Only then the thresholding is applied. Many orthogonal wavelet bases are infinitely 
more robust with respect to this quantization procedure than the Fourier trigonometric basis. 
Inspite of this, the lack of symmetry for Daubechies wavelets with compact support does not 
always satisfy the experts because some visible defects appear. This problem can be cured when 
one uses symmetric biorthogonal wavelets having compact support. 



15 Some applications 

In this section we describe mostly those applications of wavelets which are closest to our personal 
interests (the brief summary is given in Ref. [j3T|). Even among them we have to choose those 
where, in our opinion, the use of wavelets was crucial for obtaining principally new information 
unavailable with other methods (see Web site www.awavelet.com). 

15.1 Physics 

Physics applications of wavelets are so numerous that it is impossible to describe even the most 



important of them here in detail (see, e.g., || [11], |38], |39|). They are used both in purely theoretical 



studies in functional calculus, renormalization in gauge theories, conformal field theories, nonlinear 
chaoticity, and in more practical fields like quasicrystals, metheorology, acoustics, seismology, 
nonlinear dynamics of accelerators, turbulence, structure of surfaces, cosmic ray jets, solar wind, 
galactic structure, cosmological density fluctuations, dark matter, gravitational waves etc. This 
list can easily be made longer. We discuss here two problems related to use of wavelets for solving 
the differential equations with an aim to get the electronic structure of an extremely complicated 
system, and for pattern recognition in multiparticle production processes in high energy collisions. 

15.1.1 Solid state and molecules 

The exact solution of a many-body problem is impossible, and one is applying some approximate 
methods for solid state problems, e.g., the density functional theory f40f . However, electronic 



spectra of complex atomic systems are so complicated that, even within this approach, it is im- 
possible, in practice, to decipher them by commonly used methods. For example, it would require 
to calculate about 2 100 Fourier coefficients to represent the effective potential of the uranium atom 
(and even more, in case of uranium dimers). This is clearly an unrealistic task. Application of 
the wavelet analysis methods makes it possible to resolve this problem |4l], |i~7fl . The potential 



of the uranium dimer is extremely singular. It varies by more than 10 orders of magnitude. Its 
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reconstruction has now become a realistic problem. Quite high precision has been achieved with 
the help of wavelets as seen in Fig. 13. 

The solution of the density functional theory equations by the wavelet methods was obtained 
by several groups [|2|, [|3|, (25], fijj. The method described in Refs |45|, f|6| has been applied 



to a large variety of different materials. It possesses good convergence properties as is shown 
in Figs. 14 and 15. They demonstrate how fast is the decrease of energy (or wave function) 
recipies with respect to the number of iterations (Fig. 14) or of the absolute values of the wavelet 
coefficients (Fig. 15) with respect to their index (when the coefficients are ordered according to 
their magnitude). Such a fast decrease allows us to consider much more complicated systems than 
it was possible before. This method has been successfully tested for solid hydrogen crystals at 
high pressure, manganate and hydrogen clusters [||^, fUj , 3d-metals and their clusters etc. The 
density matrix can be effectively represented with the help of Daubechies wavelets [flTfl. 



15.1.2 Multiparticle production processes 

First attempts to use wavelet analysis in multiparticle production go back to P. Carruthers 



49, B(J who used wavelets for diagonalisation of covariance matrices of some simplified cascade 



models. The proposals of correlation studies in high multiplicity events with the help of wavelets 



were promoted |51| , |52fl , and also, in particular, for special correlations typical for the disoriented 
chiral condensate [53[ It was recognized |55[] that wavelet analysis can be used for pattern 
recognition in individual high multiplicity events observed in experiment. 

High energy collisions of elementary particles result in production of many new particles in 
a single event. Each newly created particle is depicted kinematically by its momentum vector, 
i.e., by a dot in the three-dimensional phase space. Different patterns formed by these dots in 
the phase space would correspond to different dynamics. To understand this dynamics is a main 
goal of all studies done at accelerators and in cosmic rays. Especially intriguing is a problem of 
the quark-gluon plasma, the state of matter with deconfined quarks and gluons which could exist 
during an extremely short time intervals. One hopes to create it in collisions of high energy nuclei. 
Nowadays, the data about Pb-Pb collisions are available where, in a single event, more than 1000 
charged particles are produced. We are waiting for RHIC accelerator in Brookhaven and LHC in 
CERN to provide events with up to 20000 new particles created. However we do not know yet 
which patterns will be drawn by the nature in individual events. Therefore the problem of phase 
space pattern recognition in an event-by-event analysis becomes meaningful. 

It is believed that the detailed characterization of each collision event could reveal the rare 
new phenomena, and it will be statistically reliable due to a large number of particles produced 
in a single event. 

When individual events are imaged visually, the human eye has a tendency to observe different 
kinds of intricate patterns with dense clusters and rarefied voids. Combined with search for 
maxima (spikes) in pseudorapidity (polar angle) distributions, it has lead to indications on the 



so-called ring-like events in cosmic ray studies 0, [5g, [Kj and at accelerators |6TJ, |62 
However, the observed effects are often dominated by statistical fluctuations. The method of 
factorial moments was proposed to remove the statistical background but it is hard to apply 



in event-by-event approach. The wavelet analysis avoids smooth polynomial trends and underlines 
the fluctuation patterns. By choosing the strongest fluctuations, one hopes to get those which 
exceed the statistical component. The wavelet transform of the pseudorapidity spectra of cosmic 



ray high multiplicity events of JACEE collaboration was done in Ref. [51 



2 The pseudorapidity is defined as rj ~ — logtan(6>/2), where 6 is a polar angle of particle emission. 
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In Ref. wavelets were first applied to analyze patterns formed in the phase space of the 
accelerator data on individual high multiplicity events of Pb-Pb interaction at energy 158 GeV 
per nucleon. With emulsion technique used in experiment the angles of emission of particles 
are measured only, and the two-dimensional (polar+azimuthal amgles) phase space is considered 
therefore. The experimental statistics is rather low but acceptance is high and homogeneous that is 
important for the proper pattern recognition. To simplify the analysis , the two-dimensional target 
diagram representing the polar and azimuthal angles of created charged particles was split into 
24 azimuthal angle sectors of 7r/12 and in each of them particles were projected onto the polar 
angle 8 axis. Thus one-dimensional functions of the polar angle (pseudorapidity) distribution 
of these particles in 24 sectors were obtained. Then the wavelet coefficients were calculated 
in all of these sectors and connected together afterwards (the continuous MHAT wavelet was 
used). The resulting pattern showed that many particles are concentrated close to some value 
of the polar angle, i.e., reveal the ring-like structure in the target diagram. The interest to such 
patterns is related to the fact that they can result from the so-called gluon Cherenkov radiation 
54], BH] or, more generally, from the gluon bremsstrahlung at a finite length within quark-gluon 



medium (plasma, in particular). More elaborate two-dimensional analysis with Daubechies (-D 8 ) 
wavelets was done recently pBj and confirmed these conclusions with jet regions tending to lie 
on some ring-like formations. It is seen, e.g. in Fig. 16 where dark regions correspond to large 
wavelet coefficients of the large-scale particle fluctuations in two of the events analyzed. The 
resolution levels 6 < j < 10 were left only after the event analysis was done to store the long- 
range correlations in the events and get rid of short-range ones and background noise. Then the 
inverse restoration was done to get the event images with these dynamic correlations left only, and 
this is what is seen in Fig. 16. It directly demonstrates that large-scale correlations chosen have 
a ring-like (ridge) pattern. With larger statistics, one will be able to say if they correspond to 
theoretical expectations. However preliminary results favor positive conclusions ||66|| . It is due to 
the two-dimensional wavelet analysis that for the first time the fluctuation structure of an event 
is shown in a way similar to the target diagram representation of events on the two-dimensional 
plot. 

Previously, some attempts |56|, |67|, |68| to consider such events with different methods of treating 
the traditional projection and correlation measures revealed just that such substructures lead to 
spikes in the polar angle (pseudorapidity) distribution and are somewhat jetty. Various Monte 
Carlo simulations of the process were compared to the data and failed to describe this jettiness in its 
full strength. More careful analysis [7(J of large statistics data on hadron-hadron interactions 
(unfortunately, however, for rather low multiplicity) with dense groups of particles separated 
showed some "anomaly" in the angular distribution of these groups awaited from the theoretical 
side. Further analysis using the results of wavelet transform is needed when many high multiplicity 
events become available. The more detailed review of this topic is given in Ref. ||71|| . 



15.2 Aviation (engines) 

Multiresolution wavelet analysis has proved to be an extremely useful mathematical method for 
analyzing complicated physical signals at various scales and definite locations. It is tempting to 
begin with analysis of signals depending on a single variable^. The time variation of the pressure 
in an aircraft compressor is one of such signals. The aim of the analysis of this signal is motivated 

13 The trick of reducing the two-dimensional function to a set of one-dimensional ones was described in the 
previous subsection devoted to particle production. 
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by the desire to find the precursors of a very dangerous effect (stall+surge) in engines leading to 
their destruction. 

An axial multistage compressor is susceptible to the formation of a rotating stall which may 
be precipitated by a distorted inlet flow induced by abrupt manoeuvres of an aircraft (helicopter) 
or flight turbulence. The aerodynamic instability behind the rotating blades is at the origin of 
this effect. The instability regions called stall cells separate from the blades and rotate with a 
speed about 60% of the rotor speed. Thus they are crossed by the blades approximately one 
time every 1.6 rotor revolutions. It limits abruptly the flow and thus the pressure delivery to the 
downstream plenum (the combustion chamber) where the fuel is burnt up and gas jet is formed. 
The high prestall pressure build-up existing in the combustion chamber tends to push the flow 
backwards through the compressor. If the flow is reversed one calls it "deep surge". In many 
cases it is attested by flames emerging from the engine inlet. It can have serious consequences 
for the engine life and operation, not to say about an aircraft and its passengers if it happens at 
in-flight conditions. That is why the search for precursors of these dangerous phenomena is very 
important. Attempts to predict the development of a rotating stall and a subsequent surge with 
a velocity measuring probe such as a hot wire anemometer ]72|| provide a precursor or warning of 



only about 10 msec what is not enough for performing any operations which would preclude surge 
development. 

Multiresolution wavelet analysis of pressure variations in a gas turbine compressor reveals much 



earlier precursors of stall and surge processes [73]. Signals from 8 pressure sensors positioned at 



various places with the compressor were recorded and digitized at 3 different modes of its operation 
(76%, 81%, 100% of nominal rotation speed) in stationary conditions with interval 1 msec during 
5-6 sec before the stall and the surge developed. An instability was induced by a slow injection 
of extra air into the compressor inlet. After a few minutes, this led to a fully blown instability 
in the compressor. The interval of last 5-6 sec was wavelet-analyzed. Since the signal fluctuates 
in time, so too does the sequence of wavelet coefficients at any given scale. A natural measure 
for this variability is the standard deviation (dispersion) of wavelet coefficients as a function of 
scale. A scale of j — 5 showed the remarkable drop about 40% of the dispersion (variance) of 
the wavelet coefficients for more than 1 sec prior the malfunction develops. The dispersion is 
calculated according to the standard expression 



a(j,M) 
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where M is the number of wavelet coefficients at the scale j within some time interval which was 
chosen equal to 1 sec. 

In Fig. 17 the time variation of the pressure in the compressor at one of the indicated above 
modes of engine operation is shown by a very irregular line. Large fluctuations of the pressure 
at the right-hand side denote the surge onset. It is hard to guess about any warning of this 
drastic instability from the shape of this curve. The wavelet coefficients of this function were 
calculated with Daubechies 8-tap (-D 8 ) wavelets. The dashed curve in the same Figure shows the 
behavior of the dispersion of wavelet coefficients as a function of time at the scale level j = 5 
which happens to be most sensitive to the surge onsetQ. Its precursor is seen as the maximum 
and the subsequent drop of the standard deviation by about 40% which appears about 1-2 sec 
prior the malfunction denoted by the large increase of both the pressure and dispersion at the 



14 It shows that the correlations among successive values of the sampled signal are likely to occur on intervals of 
the order of 30 msec. Let us note that similar correlation lengths are typical for a sampled speech signal as well. 
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right-hand side. Initial part of the dispersion plot is empty because the prescribed interval for 
the compilation of the representative distribution of wavelet coefficients is chosen equal to 1 sec. 
The randomly reordered (shuffled) sample of the same values of the pressure within the pre-surge 
time interval does not show such a drop of the dispersion as demonstrated by the dot-dashed 
line in the Figure. It proves the dynamic origin of the effect. Further analyzed characteristics 
of this process (fractality, high-moments behavior) are discussed in ]73| ]. No fractal properties 
of the signal have been found. The scale j = 5 falls off the linear dependence of the partition 
function on j necessary for fractal behavior. Let us note that the multifractality of the signal may 
fail because of accumulation of points where a chirp-like behavior happens. If it is the origin of 
this effect, it should provide a guide to the description of the dynamics of the analyzed processes. 
The time intervals before and after appearence of the precursor were analyzed separately. Higher 
moments of wavelet coefficients as functions of their rank q behave in a different way at pre- and 
post-precursor time. It indicates on different dynamics in these two regions. 

This method of wavelet analysis can be applied to any engines, motors, turbines, pumps, 
compressors etc. It provides significant improvement in the diagnostics of the operating regimes 
of existing engines, which is important for preventing their failure and, consequently, for lowering 
associated economic losses. Two patents on diagnostics and automatic regulation of engines dated 
19.03.1999 have been obtained by the authors of Ref. |73| . 



There are other problems in aviation which could be approached with wavelet analysis , e.g., 
combustion instabilities, analysis of ions in the jet from the combustion chamber or more general 
problem of metal aging and cracks etc. 



15.3 Medicine and biology 



Applications of the wavelet analysis to medicine and biology (see, e.g., [14], |74], [75]) follow also 
the same procedures as discussed above . They are either deciphering information hidden in one- 
dimensional functions (analysis of heartbeat intervals, ECG, EEG, DNA sequences etc) or pattern 
recognition (shapes of biological objects, blood cell classification etc). 



15.3.1 Hearts 

Intriguing results were obtained ]7fj] when the multiresolution wavelet analysis was applied to 
the sequence of time intervals between human heartbeats. It has been claimed that the clinically 
significant measure of the presence of heart failure is found just from analysis of heartbeat intervals 
alone without knowledge of the full electrocardiogram plot while the previous approaches provided 
statistically significant measures only. Series of about 70000 interbeat intervals were collected for 
each of 27 patients treated. They were plotted versus interval number. Signal fluctuations were 
wavelet transformed and dispersions of wavelet coefficients at different scales were calculated as 
in the case of aviation engines considered above but with averaging over the whole time interval 
because it is not necessary now to study the evolution during this time interval. Thus each patient 
was characterized by a single number (dispersion) at a given scale. It happened that the sets of 
numbers for healthy and heart failure patients did not overlap at the scale^ j = 5 as seen in 
Fig. 18. This is considered as the clinically significant measure in distinction to the statistically 
significant measure when these sets overlap partly. The dispersions for healthy patients are larger 
probably corresponding to a higher flexibility of healthy hearts. The fluctuations are less anti- 
correlated for heart failure than for healthy dynamics. 

15 By coincidence it is the same as for aviation engines. 
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This analysis was questioned by another group J77], [7S| stating that the above results on 
the scale-dependent separation between healthy and pathologic behavior depend on the database 
analyzed and on the analyzing wavelet . They propose new scale-independent measures - the 
exponents characterizing the scaling of the partition function of the wavelet coefficients of the 
heartbeat records. These exponents reveal the multifractal behavior for healthy human heartbeat 
and loss of multifractality for a life-threatening condition, congestive heart failure. This distinction 
is ascribed to nonlinear features of the healthy heartbeat dynamics. The authors claim that the 
multifractal approach to wavelet analysis robustly discriminates healthy subjects from heart-failure 
subjects. In our opinion, further studies are needed. 

Earlier, the Fourier analysis of the heart rate variability disclosed the frequency spectrum of 
the process. Three frequency regions play the main role. The high frequency peak is located about 
0.2 Hz, and the low frequency peak is about 0.1 Hz. The superlow frequency component reminds 
l//-noise with its amplitude strongly increasing at low frequencies. There are some attempts |79| 
to develop theoretical models of such a process with the help of wavelet transform. 

The wavelet analysis of one-dimensional signals is useful also for deciphering the information 
contained in ECG and EEC The functions to be analyzed there are more complicated than those 
in above studies. Some very promising results have been obtained already. In particular, it 
was shown that anomalous effects in ECG appear mostly at rather large scales (low frequencies) 
whereas the normal structures are inclined to comparatively small scales (high frequencies). It 
corresponds to the above results on heartbeat intervals. Such analysis of ECG was reported, e.g., 
in Ref. f80|. We shall not describe it in detail here because these studies are still in their initial 



stage only. The time- frequency analysis of EEG can be found, e.g., in Ref. J81|. It can locate the 
source of the epileptic activity and its propagation in the brain. The general description of EEG 



methodics see, e.g., in Refs. |82], p3| . 

The wavelet analysis has been used also for diagnostics of the embrion state during the preg- 
nancy period [[7I| . The matching pursuit method was developed to fit the properties of the signal 
as well as possible. 

15.3.2 DNA sequences 

The wavelet transform with continuous wavelets has been used for fractal analysis of DNA se- 



quences |54], |5[] in attempt to reveal the nature and the origin of long range correlations in these 
sequences. It is still of debate whether such correlations are different for protein-coding (exonic) 
and noncoding (intronic, intergenic) nucleotide sequences. To graphically portray these sequences 
in the form of one- dimensional functions, the so-called "DNA walk" analysis was applied with the 
conversion of the 4-letter text of DNA into a binary set [jSEfl . Applying wavelet transform to 121 
DNA sequences (with 47 coding and 74 noncoding regions) selected in the human genome, the 
authors of Refs. have been able to show that there really exists the difference between 

the two subsequences. It is demonstrated by the presence of long range correlations in noncod- 
ing sequences while the coding sequences look like uncorrelated random walks. To show this, 
the averaged partition (generating) functions of wavelet coefficients (see Eq.( |105| )) over these two 
statistical samples were calculated. Both of them scaled with the exponent predicted for homo- 
geneous Brownian motions, i.e., r(q) = qH — 1. The main difference between the noncoding and 
coding sequences is the value of H, namely, H nc = 0.59 ± 0.02 and H c = 0.51 ± 0.02 which dis- 
tinguish uncorrelated and correlated subsamples. Moreover, it has been shown that r(q) spectra 
extracted from both sets are surprisingly in remarkable agreement with the theoretical prediction 
for Gaussian processes if the probability density of wavelet coefficients at fixed scales are plotted 
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versus these coefficients scaled with their r.m.s. value at the same scale. The results of wavelet 
analysis clearly show that the purine (A, G) versus pyrimidine (C, T) content^ of DNA is likely 
to be relevant to the long range correlation properties observed in DNA sequences. 

15.3.3 Blood cells 

Another problem which can be solved with the help of wavelet analysis is the recognition of 
different shapes of biological objects. By itself, it has very wide range of applicability. Here, we 
consider the blood cell classification scheme according to the automatic wavelet analysis developed 
by us, and a particular illustration is given for erythrocytes classification. The automatic search, 
stability in determining the cells shapes and high speed of processing can be achieved with the 
computer wavelet analysis . The main idea of the method relies on the fact that at a definite 
resolution scale wavelet analysis reveals clearly the contours of the blood cells what allows to 
classify them. In Fig. 19 we demonstrate how it became possible to improve the image resolution 
by such a method. The low quality image of blood cells has been transformed in its wavelet image 
with clearly seen contours of the cells. 

The shapes of erythrocytes differ for various types of them. Depending on their shape, they 
can play either positive or negative role for a human being. Therefore their differentiation is very 
important. After microscope analysis of a dry blood smear and registration of blood cells in a 
computer, the wavelet analysis of the set of registered blood cells has been done. 

However the extreme random irregularities at some points of a cell contour can prevent from 
performing the analysis . Therefore a special smearing procedure was invented to avoid such points 
without loss of crucial peculiarities typical for a definite type of cells. After that, the set of blood 
cells with somewhat smeared contours is ready for wavelet analysis . It consists in the wavelet 
correlation analysis which shows different behavior of the correlation measures depending on the 
particular cell shapes. Using the expert classification of cells in a definite sample, the wavelet 
characteristics of correlations typical for a particular class were found and then inserted in the 
computer program. This software was used for classification of blood cells obtained from other 
patients, and results were cross-checked by experts. Their positive conclusions are encouraging. 
The whole procedure is now done fast and automatically without human intervention. In Fig. 20 
the blood cells of various kinds are shown. 



15.4 Data compression 

Data compression is needed if one wants, e.g., to store the data spending as low memory capacity 
as possible or to transfer it at a low cost using smaller packages. The example of FBI using wavelet 
analysis for pattern recognition and saving in that way a lot of money on computer storage of 
fingerprints is well known. This is done by omitting small wavelet coefficients after the direct 
wavelet transform was applied. Surely, to restore the information one should be confident in 
stability and good quality of the inverse transformation. Therefore, both analysis and synthesis 
procedures are necessary for the data compression and its subsequent reconstruction. Above, we 
have shown how successful is wavelet analysis in solving many problems. Due to the completeness 
of the wavelet system it is well suited for the proper inverse transform (synthesis) as well (see, 



e.g., Ref. 0). 

The approach to the solution of the problem strongly depends on the actual requirements 
imposed on the final outcome. There are at least three of them. If one wants to keep the 



16 



A, C, G, T denote the usual four letter alphabet of any DNA text. 
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quality of the restored image (film) practically as good as the initial one, the compression should 
not be very strong. It is required, e.g., if the experts should not distinguish the compressed 
and uncompressed copies of a movie when they are shown on two screens in parallel. Another 
requirement would be important if one wants to compress an image as strongly as possible leaving 
it still recognizable. It is required, e.g., if one needs to transfer the information in a line with 
limited capacity. Finally, one can require the whole procedure of analysis and synthesis to be 
done as fast as possible. It is necessary, e.g., if the information must be obtained immediately 
but at lower cost. These three situations ask for different choice of wavelets to optimize analysis 
and synthesis. In all cases wavelet analysis has an advantage over the coding methods which use 
the windowed Fourier transform but the quantitative estimate of this advantage varies with the 
problem solved. 

Let us remind that any image in the computer should be digitized and saved as the bitmap 
or, in other words, as a matrix, each element of which describes the color of the point in the 
original image. The number of elements of the matrix (image points) depends on the resolution 
chosen in the digital procedure. It is a bitmap that is used for the subsequent reproduction of 
the image on a screen, a printer etc. However, it is not desirable to store it in such a form 
because it would ask for the huge computer capacity. That is why, at present, the numerous 
coding algorithms (compression) of a bitmap are developed, whose effectiveness depends on the 
image characteristics^ All these algorithms belong to the two categories - they are either the 
coding with a loss of information or without any loss of it (in that case the original bitmap can 
be completely recovered by the decoding procedure). In a more general applications, the last 
algorithm is often called as the data arxivation. 

As an example, we consider the compression algorithm with a loss of information. "The loss of 
information" means in this case that the restored image is not completely identical to the original 
one but this difference is practically indistinguishable by a human eyeQ. 

At present, most of the computer stored images (in particular, those used in Internet) with the 
continuous tone[^] are coded with the help of the algorithm JPEG (for the detailed description, 
see |88|j ). Main stages of this algorithm are as follows. One splits the image in the matrices with 
8 by 8 dots. For each matrix, the discrete cosine-transform is performed. The obtained frequency 
matrices are subjected to the so-called "quantization" procedure when the most crucial for the 
visual recognition elements are chosen according to a special "weight" table compiled beforehand 
by specialists after their collective decision was achieved. This is the only stage where the "loss" of 
information happens. Then the transformed matrix with the chosen frequencies (scales) becomes 
compact and it is coded by the so-called entropy method (also called arithmetic or Huffmann 
method). 

Applied above algorithm differs from the described one by use of the wavelets instead the 
windowed cosine-transform and by the transform of the whole image instead the 8 by 8 matrix 
only. Fig. 21 demonstrates the original image and its two final images restored after the similar 
compression according to JPEG and wavelet algorithms. It is easily seen that the quality of the 

17 It is evident that to store the " Black square" , one does not have to deal with a matrix of all black dots but 
must store just the three numbers showing the width, the height and the color. This is the simplest example of 
the image compression. 

18 The "compression quality" is usually characterized by a parameter which varies from to 100. Here 100 implies 
the minimal compression (the best quality) , and the restored image is practically indistinguishable from the initial 
one, while means the maximum compression which still allows to distinguish some details of the original image 
in the recovered one. 

19 that is the images contained many slightly different colors. Ordinarily to store such images in a computer 16 
millions colors per pixel are used 
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wavelet image is noticeably higher than for JPEG at practically the same size of the coded files. 
The requirement of the same quality for both algorithms leads to the file sizes 1.5-2 times smaller 
for the wavelet algorithm that could be crucial for the transmission of the image, especially if the 
transmission line capacity is limited. 

15.5 Microscope focusing 

Surely, the problem of microscope focusing is tightly related to pattern recognition. One should 
resolve well focused image from that with diffused contours. It is a comparatively easy task for 
wavelets because in the former case the image gradients at the contour are quite high while in the 
latest one they become rather vague. Therefore the wavelet coefficients are larger when microscope 
is well focused to the object and drastically decrease with defocusing. At a definite resolution level 
corresponding to the contour scale the defocusing effect is strongest. It is demonstrated in Fig. 
22. The peak of the wavelet coefficients shows the most focused image. After doing the wavelet 
analysis of an image, the computer sends a command to shift the microscope so that larger values 
of the wavelet coefficients and thus better focusing are attained. At other levels it is somewhat less 
pronounced and, moreover, it is asymmetric depending on whether the microscope is positioned 
above the focus location or below it. This asymmetry has been used for the automatic microscope 
focusing with a well defined direction of its movement toward the focus location. 

16 Conclusions 

The beauty of the mathematical construction of the wavelet transformation and its utility in 
practical applications attract nowadays researchers from both pure and applied science. Moreover, 
commercial outcome of this research becomes quite important. We have outlined a minor part of 
activity in this field. However we hope that the general trends in the development of this subject 
became comprehended and appreciated. 

Unique mathematical properties of wavelets made them a very powerful tool in analysis and 
subsequent synthesis of any signal. The orthogonality property allows to get an independent infor- 
mation from different scales. Normalization assures that at different steps of this transformation 
the value of information is not changed and mixed up. The property of localization helps get 
knowledge about those regions in which definite scales (frequencies) play most important role. 
Finally, the completeness of the system of wavelet functions formed by dilations and translations 
results in validity of the inverse transformation. 

The multiresolution analysis leads to fast wavelet transform and together with the procedure 
of the nonstandard matrix multiplication to rather effective computing. Analytic properties of 
functions, local and global Holder indices, multifractal dimensions etc are subject to wavelet 
analysis . The natural accomodation of differential operators in this enviroment opens a way to 
effective solution of differential equations. 

All these properties enable us to analyze through the wavelet transform complex signals at 
different scales and locations, to solve equations describing extremely complicated nonlinear sys- 
tems involving interactions at many scales, to study very singular functions etc. The wavelet 
transform is easily generalized to any dimension, and multidimensional objects can be analyzed 
as well. Therefore, wavelets are indispensable for pattern recognition. 

Thus the wavelet applications in various fields are numerous and give nowadays very fruitful 
outcome. In this review paper we managed to describe only some of these applications leaving 



43 



aside the main bulk of them. The potentialities of wavelets are still not used at their full strength. 

However one should not cherish vain hopes that this machinery works automatically in all 
situations by using its internal logic and does not require any intuition. According to Meyer ||, 
"no 'universal algorithm' is appropriate for the extreme diversity of the situations encountered". 
Actually it needs a lot of experience in choosing the proper wavelets, in suitable formulation of the 
problem under investigation, in considering most important scales and characteristics describing 
the analyzed signal, in the proper choice of the algorithms (i.e., the methodology) used, in studying 
the intervening singularities, in avoiding possible instabilities etc. By this remark we would not 
like to prevent newcomers from entering the field but, quite to the contrary, to attract those who 
are not afraid of hard but exciting research and experience. 

Appendix 1 

The general approach which respects all properties required from wavelets is known as the 
multiresolution approximation and defined mathematically in the following way: 
DEFINITION §] 

A multiresolution approximation of L 2 (R n ) is, by definition, an increasing sequence Vj, j G Z, 
of closed linear subspaces of L 2 (R n ) with the following properties: 

1- fToo V- = {0}, IToo V 3 is dense in L 2 (R n ); 

2. for all / G L 2 (R n ) and all j G Z n 

f(x) G Vj <- f(2x) G V j+1 ; 

3. for all / G L 2 (R n ) and all k G Z n 

f{x) g IW f(x -k)e V ; 

4. there exists a function, g(x) G Vo, such that the sequence g(x — k), k G Z n , is an orthonormal 
(or Riesz) basis of the space Vq. It is clear, that scaling versions of the function g{x) form 
bases for all Vj. 

Thus the multiresolution analysis consists of a sequence of successive approximation spaces Vj 
which are scaled and invariant under integer translation versions of the central space Vq. There 
exists an orthonormal or, more generally, Riesz basis in this space. The Haar multiresolution 
analysis can be written in these terms as 

Vj = {/ G L 2 (R); for all k G Z : /| [fc2 -i ,(k+i)2-J) = const}. (116) 

In Fig. 3 we showed what the projections of some / on the Haar spaces Vo, V\ might look 
like. The general distributions are decomposed into series of correctly localized fluctuations of a 
characteristic form defined by the wavelet chosen. 

The functions (fj^ form an orthonormal basis of Vj. The orthogonal complement of Vj in Vj+i 
is called Wj. The subspaces Wj form a mutually orthogonal set. The sequence of ipj^ constitutes 
an orthonormal basis for Wj at any definite j. The whole collection of ipj^ and ipj^ for all j 
is an orthonormal basis for L 2 (R). This ensures us that we have constructed a multiresolution 
analysis approach, and the functions ipj,k and ipj^ constitute the small and large scale filters, 
correspondingly. The whole procedure of the multiresolution analysis has been demonstrated in 
graphs of Fig. 4. 
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In accordance with the above formulated goal, one can define the notion of wavelets in the 
following way: 

DEFINITION @] 

A function ip(x) of a real variable is called a (basic) wavelet of class m if the following four 
properties hold: 

1. if m = 0, ip(x) and <p(x) belong to L°°(R); if m > l,ip{x), f(x) and all their derivatives up 
to order m belong to L°°(R); 

2. ip(x), <f(x) and all their derivatives up to order m decrease rapidly as x — > ±oo; 

3. /f^ dxx n ip(x) = for < n < m, and /f^ dxip(x) = 1; 

4. the collection of functions 2^ 2 ip{2^x — k), 2^' 2 cp(2^x — k), j, k G Z, is an orthonormal basis 
ofL 2 GR). 



Then the equation (22) is valid. If ip and <p both have compact support, then it gives a 
decomposition of any distribution of order less than m. Moreover, the order of the distribution / 
(the nature of its singularities) can be calculated exactly and directly from the size of its wavelet 
coefficients as has been shown in Sections 11, 12. The functions 2^ 2 ip(2 : 'x — k) are the wavelets 
(generated by the "mother" ip), and the conditions 1., 2., 3. express, respectively, the regularity, 
the localization and the oscillatory character. One sees that the property 3. is satisfied for the 
Haar wavelets for m = only, i.e., its regularity is r = 0. In general, for each integer r > 1, 
there exists a multiresolution approximation Vj of L 2 (R) which is r-regular and such that the 
associated real- valued functions (p and ip have compact support. As the regularity increases, 
so do the supports of (p and ip. The wavelet 2- J ' 2 ^(2 :7 x — k) is "essentially concentrated" on 
the dyadic interval / = [k2~i , {k + 1)2 _J ') for j,k G Z. Its Fourier transform is supported by 
2 J (27r/3) < \u\ < 2 J (87r/3). In fact, it has a one octave frequency range. 

Appendix 2 

In the wavelet analysis of operator expressions the integral Calderon-Zygmund operators are 
often used. There are several definitions of them (see, e.g., the monograph ||). We give here their 
definition used by Daubechies 0. 

DEFINITION 

A Calderon-Zygmund operator T on R is an integral operator 

Tf(x)= J ' dyK(x,y)f(y) (117) 
for which the integral kernel satisfies 

\K(x,y)\<— — , (118) 
\x - y\ 

\^-K(x,y)\ + \^-K(x,y)\ < — (119) 
ox oy \x — y\ z 

and which defines a bounded operator on L 2 (R). 

With such a definition the special care should be taken at x = y. 
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Appendix 3 

In the literature devoted to the signal processing, the so-called Littlewood-Paley decomposition 
is often used. It is closely related to the wavelet transform. Therefore we give here the dictionary 
which relates the Littlewood-Paley coefficients denoted as Aj(f)(x) to both discrete (dj :k ) and 
continuous (W(a,b)) wavelet coefficients. 

Aj(f)(x) = r' %. 2l , = W(2~t, x), (120) 

or 

d hk = 2~ nj/2 W(2~ j , 2~ j k) = 2- nj/2 A J (f)(2- j k). (121) 
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Figure Captions 

Fig. 1. The histogram and its wavelet decomposition. 
The initial histogram is shown in the upper part of the Figure. It corresponds to the level j = 4 
with 16 bins (Eq. (0)). The intervals are labelled on the abscissa axis at their left-hand sides. The 
next level j = 3 is shown below. The mean values over two neighboring intervals of the previous 
level are shown at the left-hand side. They correspond to eight terms in the first sum in Eq. 
(f|). At the right-hand side, the wavelet coefficients ds,k are shown. Other graphs for the levels 
j = 2, 1, are obtained in a similar way. 

Fig. 2. The Haar scaling function (p(x) = (po t o(x) and "mother" wavelet ip(x) = ipo^(x). 

Fig. 3. The analyzed function (a) and its Haar projections onto two subsequent spaces V. 

Fig. 4. The graphical representation of the multiresolution analysis with decomposition of 
Vj + i space onto its subspace Vj and the orthogonal complement Wj iterated to the lower levels. 

Fig. 5. Daubechies scaling functions (solid lines) and wavelets (dotted lines) for M = 2,4. 

Fig. 6. The fast wavelet transform algorithm. 

Fig. 7. Coiflets (dotted lines) and their scaling functions (solid lines) for M — 4. 
Fig. 8. The wavelet packet construction. 

Fig. 9. The example of the wavelet analysis of a two-dimensional plot. 
One sees that either horizontal or vertical details of the plot are more clearly resolved in the corre- 
sponding coefficients. Also, the small or large size (correlation length) details are better resolved 
depending on the level chosen. 

Fig. 10. The lattices of time-frequency localization for the wavelet transform (left) and win- 
dowed Fourier transform (right). 

Fig. 11. The matrix representation of the standard approach to the wavelet analysis . 
The parts containing non-zero wavelet coefficients are shaded. 

Fig. 12. The nonstandard matrix multiplication in the wavelet analysis . 

Fig. 13. The Coulomb potential V (the solid line) as a function of the distance and the preci- 
sion of its calculation AV^ (the dashed line) for the Uranium dimer in arbitrary units (a.u). 
It is drawn for the direction from one atom to another one in the left-hand side and for the op- 
posite direction in the right-hand side. The complete symmetry is seen with a larger distances 
shown in the right-hand side. 

Fig. 14. Fast decrease of energy recipies with respect to the number of iterations demonstrates 
good convergence of the solution of the density functional theory equations by the wavelet methods. 
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Fig. 15. The same as in Fig. 14 for the absolute values of the wavelet coefficients. 

Fig. 16. The restored images of long-range correlations in experimental target diagrams. 
They show the typical ring-like structure in some events of central Pb-Pb interactions. 

Fig. 17. The signal of the pressure sensor (the solid line) and the dispersion of its wavelet 
coefficients (the dashed line). 

The time variation of the pressure in the engine compressor (the irregular solid line) has been 
wavelet analyzed. The dispersion of the wavelet coefficients (the dashed line) shows the maximum 
and the remarkable drop prior the drastic increase of the pressure providing the precursor of this 
malfunction. The shuffled set of the data does not show such an effect for the dispersion of the 
wavelet coefficients (the upper curve) pointing to its dynamic origin. 

Fig. 18. The sets of the values of the dispersions of the wavelet coefficients for the heartbeat 
intervals of the healthy (white circles) and heart failure (black circles) patients are shown. 
They do not overlap at j = 4. 

Fig. 19. a) The photo of blood cell, obtained from the microscope, 
b) The same photo after the wavelet analysis done. 

The blurred image of a blood cell (left) becomes clearly visible one (right) after the wavelet trans- 
form. 

Fig. 20. The classification of the erythrocytes cells. 

Fig. 21. a) The original photo (the file size is 461760 bytes). 

b) The photo reconstructed after compression according to the JPEG- algorithm, (the file size is 
3511 bytes). 

c) The photo reconstructed after compression according to the wavelet algorithm (the file size is 
3519 bytes). 

Better quality of the wavelet transform is clearly seen when comparing the original image (left) 
and two images restored after the similar compression by the windowed Fourier transform (middle) 
and the wavelet transform (right). 

Fig. 22. The focusing line. 
The values of the wavelet coefficients calculated for different positions of the microscope are shown 
on the vertical axis. The corresponding positions are numbered on the horizontal axis. The best 
focusing is obtained at the maximum of the curve (positions 6 - 10). Large wavelet coefficients 
correspond to the better focused image. The microscope moves to the focus position being driven 
by computer commands for increase of the wavelet coefficients of the analyzed objects. 
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